"multimodal foundation models" Papers
9 papers found
An Evidence-Based Post-Hoc Adjustment Framework for Anomaly Detection Under Data Contamination
Sukanya Patra, Souhaib Ben Taieb
NEURIPS 2025, spotlight, arXiv:2510.21296
Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
Yucheng Shi, Quanzheng Li, Jin Sun et al.
ICLR 2025, arXiv:2502.14044, 8 citations
Low-Biased General Annotated Dataset Generation
Dengyang Jiang, Haoyu Wang, Lei Zhang et al.
CVPR 2025, arXiv:2412.10831
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.
NEURIPS 2025, arXiv:2502.16671, 8 citations
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
Chejian Xu, Jiawei Zhang, Zhaorun Chen et al.
ICLR 2025, arXiv:2503.14827, 11 citations
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Zhengfeng Lai, Vasileios Saveris, Chen Chen et al.
ICLR 2025, arXiv:2410.02740, 9 citations
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
Ziyao Shangguan, Chuhan Li, Yuxuan Ding et al.
ICLR 2025, oral, arXiv:2410.23266, 37 citations
VITRIX-UniViTAR: Unified Vision Transformer with Native Resolution
Limeng Qiao, Yiyang Gan, Bairui Wang et al.
NEURIPS 2025, oral, 3 citations
Libra: Building Decoupled Vision System on Large Language Models
Yifan Xu, Xiaoshan Yang, Yaguang Song et al.
ICML 2024, arXiv:2405.10140, 10 citations