✨IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
📝 Summary:
IndexTTS builds on the XTTS and Tortoise models to improve naturalness and zero-shot voice cloning. It uses hybrid character-pinyin modeling for Chinese and optimized vector quantization. The authors report more controllable usage, faster inference, and better performance than the systems it is compared against.
🔹 Publication Date: Feb 8, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.05512
• PDF: https://arxiv.org/pdf/2502.05512
• GitHub: https://github.com/index-tts/index-tts
🔹 Models citing this paper:
• https://huggingface.co/IndexTeam/IndexTTS-2
• https://huggingface.co/IndexTeam/Index-TTS
• https://huggingface.co/Toxzic/indextts-colab
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/IndexTeam/IndexTTS
• https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
• https://huggingface.co/spaces/jairwaal/image
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#TextToSpeech #ZeroShotLearning #VoiceCloning #AI #MachineLearning