ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho
ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

📝 Summary:
ViSAudio is an end-to-end framework that generates high-quality binaural spatial audio directly from silent video. It uses conditional flow matching and a dual-branch architecture, outperforming previous methods in immersion and consistency. The paper also introduces the BiAudio dataset for this ...
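
For readers unfamiliar with conditional flow matching, here is a minimal sketch of one common formulation (a straight-line path between a noise sample and a data sample). It is not taken from the ViSAudio paper or repository: the function names, the flat-list "latents", and the choice of a linear path are all assumptions made for illustration, and the video conditioning and dual-branch network are left out entirely.

```python
import random

def cfm_training_pair(x0, x1, t):
    """Build one training example for a linear (straight-line) probability path.

    x0: noise sample, x1: data sample (e.g. an audio latent), t: time in [0, 1].
    Returns the interpolated point x_t and the target velocity x1 - x0.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target_v = [b - a for a, b in zip(x0, x1)]
    return x_t, target_v

def cfm_loss(predicted_v, target_v):
    """Mean squared error between the predicted and the target velocity."""
    return sum((p - q) ** 2 for p, q in zip(predicted_v, target_v)) / len(target_v)

if __name__ == "__main__":
    rng = random.Random(0)
    x1 = [0.5, -0.25, 1.0, 0.0]             # hypothetical stand-in for an audio latent
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise of the same shape
    t = rng.random()                        # time sampled uniformly from [0, 1)
    x_t, target_v = cfm_training_pair(x0, x1, t)
    # A real model would predict the velocity from (x_t, t, video features);
    # an all-zero prediction is used here only to show the loss computation.
    print(cfm_loss([0.0] * len(x1), target_v))
```

In training, a network's velocity prediction would replace the all-zero placeholder, and at inference the learned velocity field would be integrated from noise toward an audio sample.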

🔹 Publication Date: Dec 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03036
• PDF: https://arxiv.org/pdf/2512.03036
• Project Page: https://kszpxxzmc.github.io/ViSAudio-project/
• GitHub: https://github.com/kszpxxzmc/ViSAudio

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#SpatialAudio #AudioGeneration #DeepLearning #ComputerVision #AI