Data Science | Machine Learning with Python for Researchers

The Data Science and Python channel is for researchers and advanced programmers

Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation

📝 Summary:
Ovi is a unified audio-video generation model built from twin DiT modules that are joined by blockwise cross-modal fusion. Because audio and video are generated together rather than in separate stages, the design aims for natural synchronization and high-quality multimodal output while replacing earlier multi-stage pipelines.
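
To make "twin backbones with blockwise cross-modal fusion" concrete, here is a minimal NumPy sketch of the general idea, not Ovi's actual implementation. It assumes single-head attention, random untrained weights, and no timestep or text conditioning; the names `fused_block`, `twin_backbone`, and `init_block` are hypothetical and do not come from the paper or its repository. Each block runs self-attention inside the audio and video branches, then lets each branch cross-attend to the other.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q_tokens, kv_tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention from q_tokens onto kv_tokens."""
    q = q_tokens @ w_q
    k = kv_tokens @ w_k
    v = kv_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def init_block(rng, dim):
    """Hypothetical parameters: one self- and one cross-attention per branch."""
    def proj():
        return [rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3)]
    return {"audio_self": proj(), "video_self": proj(),
            "audio_cross": proj(), "video_cross": proj()}

def fused_block(audio, video, params):
    # Each branch first attends to its own tokens (with residual connections).
    audio = audio + attention(audio, audio, *params["audio_self"])
    video = video + attention(video, video, *params["video_self"])
    # Cross-modal fusion: each branch queries the other branch's tokens.
    audio_out = audio + attention(audio, video, *params["audio_cross"])
    video_out = video + attention(video, audio, *params["video_cross"])
    return audio_out, video_out

def twin_backbone(audio, video, blocks):
    # Fusion happens inside every block, not once at the end.
    for params in blocks:
        audio, video = fused_block(audio, video, params)
    return audio, video

rng = np.random.default_rng(0)
dim = 16
audio_tokens = rng.normal(size=(10, dim))  # 10 audio tokens (assumed sizes)
video_tokens = rng.normal(size=(6, dim))   # 6 video tokens
blocks = [init_block(rng, dim) for _ in range(3)]
audio_out, video_out = twin_backbone(audio_tokens, video_tokens, blocks)
print(audio_out.shape, video_out.shape)
```

Each branch keeps its own sequence length, so the two modalities can use different token rates while still exchanging information at every block.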

🔹 Publication Date: Sep 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.01284
• PDF: https://arxiv.org/pdf/2510.01284
• Project Page: https://aaxwaz.github.io/Ovi
• Github: https://github.com/character-ai/Ovi

🔹 Models citing this paper:
• https://huggingface.co/chetwinlow1/Ovi
• https://huggingface.co/rkfg/Ovi-fp8_quantized

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/akhaliq/Ovi
• https://huggingface.co/spaces/deddytoyota/Ovi
• https://huggingface.co/spaces/alexnasa/Ovi-ZEROGPU

#AudioVideoGeneration #MultimodalAI #DeepLearning #CrossModalFusion #AIResearch