Papers by Koichi Saito
2 papers found
Conference
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
Koichi Saito, Dongjun Kim, Takashi Shibuya et al.
ICLR 2025, arXiv:2405.18503
10 citations
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation
Jiaben Chen, Zixin Wang, Ailing Zeng et al.
NeurIPS 2025, arXiv:2510.07249
3 citations