Papers on "vision-language tasks"
4 papers found
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
Shengqiong Wu, Hao Fei, Liangming Pan et al.
AAAI 2025 | arXiv:2412.11124 | 19 citations
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang, Jiajun Deng, Mingbo Jia
AAAI 2024 | arXiv:2312.15162 | 14 citations
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning
Haokun Chen, Yao Zhang, Denis Krompass et al.
AAAI 2024 | arXiv:2308.12305 | 86 citations
VIGC: Visual Instruction Generation and Correction
Bin Wang, Fan Wu, Xiao Han et al.
AAAI 2024 | arXiv:2308.12714 | 88 citations