Poster "video understanding benchmark" Papers
2 papers found
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu, Jingwei Sun, Yueqian Lin et al.
ICCV 2025, arXiv:2503.10742
7 citations
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
ECCV 2024, arXiv:2403.05021
16 citations