"vision-language capabilities" Papers
2 papers found
Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
Pritam Sarkar, Sayna Ebrahimi, Ali Etemad et al.
ICLR 2025 arXiv:2405.18654
22 citations
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu, Zhengyuan Yang, Linjie Li et al.
ICML 2024 arXiv:2308.02490
1066 citations