Poster papers matching "multimodal integration"
11 papers found
Cross-modal Associations in Vision and Language Models: Revisiting the Bouba-Kiki Effect
Tom Kouwenhoven, Kiana Shahrasbi, Tessa Verhoef
NeurIPS 2025, arXiv:2507.10013
FinMMR: Make Financial Numerical Reasoning More Multimodal, Comprehensive, and Challenging
Zichen Tang, Haihong E, Jiacheng Liu et al.
ICCV 2025, arXiv:2508.04625. 6 citations.
PARC: A Quantitative Framework Uncovering the Symmetries within Vision Language Models
Jenny Schmalfuss, Nadine Chang, Vibashan VS et al.
CVPR 2025, arXiv:2506.14808. 1 citation.
Position: AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift
Eunsu Baek, Keondo Park, Jeonggil Ko et al.
NeurIPS 2025. 3 citations.
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Sheng Liu, Haotian Ye, James Y Zou
ICLR 2025. 29 citations.
scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling
Joel Dapello, Marcel Nassar, Ridvan Eksi et al.
NeurIPS 2025
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Sicong Leng, Yun Xing, Zesen Cheng et al.
NeurIPS 2025, arXiv:2410.12787. 30 citations.
The Indra Representation Hypothesis
Jianglin Lu, Hailing Wang, Kuo Yang et al.
NeurIPS 2025
X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing
Xinyan Chen, Jianfei Yang
ICLR 2025, arXiv:2410.10167. 11 citations.
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations
Jeong Hun Yeo, Minsu Kim, Chae Won Kim et al.
ICCV 2025, arXiv:2503.06273. 5 citations.
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
Sara Sarto, Marcella Cornia, Lorenzo Baraldi et al.
ECCV 2024, arXiv:2407.20341. 12 citations.