Poster "text-to-speech synthesis" Papers
5 papers found
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
Keon Lee, Dong Won Kim, Jaehyeon Kim et al.
ICLR 2025, arXiv:2406.11427, 28 citations
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura, Takumi Hirose, Masanari Ohi et al.
ICLR 2025, arXiv:2410.04380, 5 citations
Improved Sampling Algorithms for Lévy-Itô Diffusion Models
Vadim Popov, Assel Yermekova, Tasnima Sadekova et al.
ICLR 2025
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang, Haoyue Zhan, Liwei Liu et al.
ICLR 2025, arXiv:2409.00750, 161 citations
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander Liu, Sang-gil Lee, Chao-Han Huck Yang et al.
ICLR 2025, arXiv:2503.00733, 4 citations