Poster "video captioning" Papers
14 papers found
Conference
ARGUS: Hallucination and Omission Evaluation in Video-LLMs
Ruchit Rawal, Reza Shirkavand, Heng Huang et al.
ICCV 2025arXiv:2506.07371
4
citations
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad, Vibhav Vineet, Yogesh S. Rawat
CVPR 2025arXiv:2503.08585
13
citations
HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation
Trong-Thuan Nguyen, Pha Nguyen, Jackson Cothren et al.
CVPR 2025arXiv:2411.18042
9
citations
Modeling dynamic social vision highlights gaps between deep learning and humans
Kathy Garcia, Emalie McMahon, Colin Conwell et al.
ICLR 2025
Progress-Aware Video Frame Captioning
Zihui Xue, Joungbin An, Xitong Yang et al.
CVPR 2025arXiv:2412.02071
7
citations
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
Mingfei Han, Linjie Yang, Xiaojun Chang et al.
ICLR 2025arXiv:2312.10300
48
citations
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang, Yi Yang
NEURIPS 2025arXiv:2503.01739
11
citations
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
ECCV 2024arXiv:2403.05021
16
citations
COM Kitchens: An Unedited Overhead-view Procedural Videos Dataset a Vision-Language Benchmark
Atsushi Hashimoto, Koki Maeda, Tosho Hirasawa et al.
ECCV 2024
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
Nina Shvetsova, Anna Kukleva, Xudong Hong et al.
ECCV 2024arXiv:2310.04900
33
citations
Learning Video Context as Interleaved Multimodal Sequences
Qinghong Lin, Pengchuan Zhang, Difei Gao et al.
ECCV 2024arXiv:2407.21757
12
citations
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace et al.
CVPR 2024arXiv:2402.19479
351
citations
UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
Haoyu Lu, Yuqi Huo, Guoxing Yang et al.
ICLR 2024arXiv:2302.06605
55
citations
Video ReCap: Recursive Captioning of Hour-Long Videos
Md Mohaiminul Islam, Vu Bao Ngan Ho, Xitong Yang et al.
CVPR 2024arXiv:2402.13250
85
citations