"speech synthesis" Papers

11 papers found

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Zhen Ye, Peiwen Sun, Jiahe Lei et al.

AAAI 2025, arXiv:2408.17175, 75 citations

E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis

Zhisheng Zhang, Derui Wang, Yifan Mi et al.

NeurIPS 2025, arXiv:2511.07099

FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles

Tian-Hao Zhang, Jiawei Zhang, Jun Wang et al.

AAAI 2025, arXiv:2501.03181, 2 citations

ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis

Xiangheng He, Junjie Chen, Zixing Zhang et al.

AAAI 2025, arXiv:2412.11795, 1 citation

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models

Kim Sung-Bin, Jeongsoo Choi, Puyuan Peng et al.

ICCV 2025, arXiv:2504.02386, 6 citations

ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis

Jungil Kong, Junmo Lee, Jeongmin Kim et al.

ICML 2024, arXiv:2311.11745, 3 citations

Scaling Speech Technology to 1,000+ Languages

Vineel Pratap Konduru, Andros Tjandra, Bowen Shi et al.

ICML 2024

SECap: Speech Emotion Captioning with Large Language Model

Yaoxun Xu, Hangting Chen, Jianwei Yu et al.

AAAI 2024, arXiv:2312.10381, 58 citations

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

Paarth Neekhara, Shehzeen Hussain, Rafael Valle et al.

ICML 2024, arXiv:2310.09653, 7 citations

UniAudio: Towards Universal Audio Generation with Large Language Models

Dongchao Yang, Jinchuan Tian, Xu Tan et al.

ICML 2024

What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection

XiaoHui Zhang, Jiangyan Yi, Chenglong Wang et al.

AAAI 2024, arXiv:2312.09651, 43 citations