"video question answering" Papers
57 papers found • Page 2 of 2
Vamos: Versatile Action Models for Video Understanding
Shijie Wang, Qi Zhao, Minh Quan et al.
ECCV 2024 • arXiv:2311.13627
36 citations
VideoCon: Robust Video-Language Alignment via Contrast Captions
Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.
CVPR 2024 • arXiv:2311.10111
30 citations
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei, Shengqiong Wu, Wei Ji et al.
ICML 2024 (oral) • arXiv:2501.03230
146 citations
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.
ICML 2024 • arXiv:2402.13217
73 citations
Video Question Answering with Procedural Programs
Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.
ECCV 2024 • arXiv:2312.00937
37 citations
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Guangzhi Sun, Wenyi Yu, Changli Tang et al.
ICML 2024 (oral) • arXiv:2406.15704
76 citations
YTCommentQA: Video Question Answerability in Instructional Videos
Saelyne Yang, Sunghyun Park, Yunseok Jang et al.
AAAI 2024 • arXiv:2401.17343
5 citations