Poster "dense captioning" Papers
2 papers found
Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
Fangrui Zhu, Hanhui Wang, Yiming Xie et al.
NeurIPS 2025, arXiv:2506.04220
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Shuhuai Ren, Linli Yao, Shicheng Li et al.
CVPR 2024, arXiv:2312.02051
372 citations