Highlight "video question answering" Papers
3 papers found
Can I Trust Your Answer? Visually Grounded Video Question Answering
Junbin Xiao, Angela Yao, Yicong Li et al.
CVPR 2024 (highlight), arXiv:2309.01327
113 citations
Koala: Key Frame-Conditioned Long Video-LLM
Reuben Tan, Ximeng Sun, Ping Hu et al.
CVPR 2024 (highlight), arXiv:2404.04346
64 citations
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li, Yali Wang, Yinan He et al.
CVPR 2024 (highlight), arXiv:2311.17005
902 citations