Paper "multimodal alignment" Papers
5 papers found
Conference
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning
Wenwen Zhuang, Xin Huang, Xiantao Zhang et al.
AAAI 2025paperarXiv:2408.08640
60
citations
Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video
Zhaobo Qi, Yibo Yuan, Xiaowen Ruan et al.
AAAI 2024paperarXiv:2401.07567
15
citations
TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training
Chaoya Jiang, Wei Ye, Haiyang Xu et al.
AAAI 2024paperarXiv:2312.08846
6
citations
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA
Chengen Lai, Shengli Song, Shiqi Meng et al.
AAAI 2024paperarXiv:2312.13594
10
citations
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su, Judith Li, Qingqing Huang et al.
AAAI 2024paperarXiv:2305.06594
24
citations