"video understanding benchmark" Papers
3 papers found
EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs
Yuping He, Yifei Huang, Guo Chen et al.
NeurIPS 2025 (oral), arXiv:2507.18342
11 citations
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu, Jingwei Sun, Yueqian Lin et al.
ICCV 2025, arXiv:2503.10742
7 citations
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
ECCV 2024, arXiv:2403.05021
16 citations