Highlight "vision-language alignment" Papers
3 papers found
Conference
Assessing and Learning Alignment of Unimodal Vision and Language Models
Le Zhang, Qian Yang, Aishwarya Agrawal
CVPR 2025highlightarXiv:2412.04616
15
citations
Corvid: Improving Multimodal Large Language Models Towards Chain-of-Thought Reasoning
Jingjing Jiang, Chao Ma, Xurui Song et al.
ICCV 2025highlightarXiv:2507.07424
7
citations
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Shaoan Xie, Lingjing Kong, Yujia Zheng et al.
CVPR 2025highlightarXiv:2507.22264
4
citations