"temporal video understanding" Papers
3 papers found
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos
Tiantian Geng, Jinrui Zhang, Qingni Wang et al.
CVPR 2025, arXiv:2411.19772
34 citations
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.
ECCV 2024, arXiv:2407.16658
19 citations
VideoCon: Robust Video-Language Alignment via Contrast Captions
Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.
CVPR 2024, arXiv:2311.10111
30 citations