"speech synthesis" Papers
11 papers found
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Zhen Ye, Peiwen Sun, Jiahe Lei et al.
AAAI 2025 | arXiv:2408.17175
75 citations
E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis
Zhisheng Zhang, Derui Wang, Yifan Mi et al.
NeurIPS 2025 | arXiv:2511.07099
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
Tian-Hao Zhang, Jiawei Zhang, Jun Wang et al.
AAAI 2025 | arXiv:2501.03181
2 citations
ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis
Xiangheng He, Junjie Chen, Zixing Zhang et al.
AAAI 2025 | arXiv:2412.11795
1 citation
VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models
Kim Sung-Bin, Jeongsoo Choi, Puyuan Peng et al.
ICCV 2025 | arXiv:2504.02386
6 citations
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Jungil Kong, Junmo Lee, Jeongmin Kim et al.
ICML 2024 | arXiv:2311.11745
3 citations
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap Konduru, Andros Tjandra, Bowen Shi et al.
ICML 2024
SECap: Speech Emotion Captioning with Large Language Model
Yaoxun Xu, Hangting Chen, Jianwei Yu et al.
AAAI 2024 | arXiv:2312.10381
58 citations
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Paarth Neekhara, Shehzeen Hussain, Rafael Valle et al.
ICML 2024 | arXiv:2310.09653
7 citations
UniAudio: Towards Universal Audio Generation with Large Language Models
Dongchao Yang, Jinchuan Tian, Xu Tan et al.
ICML 2024
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection
XiaoHui Zhang, Jiangyan Yi, Chenglong Wang et al.
AAAI 2024 | arXiv:2312.09651
43 citations