"scene comprehension" Papers
3 papers found
Conference
From Easy to Hard: The MIR Benchmark for Progressive Interleaved Multi-Image Reasoning
Hang Du, Jiayang Zhang, Guoshun Nan et al.
ICCV 2025arXiv:2509.17040
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Fanqing Meng, Jin Wang, Chuanhao Li et al.
ICLR 2025arXiv:2408.02718
48
citations
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei, Shengqiong Wu, Wei Ji et al.
ICML 2024oralarXiv:2501.03230
146
citations