"video captioning" Papers
17 papers found
ARGUS: Hallucination and Omission Evaluation in Video-LLMs
Ruchit Rawal, Reza Shirkavand, Heng Huang et al.
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang, Jiayan Teng, Wendi Zheng et al.
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad, Vibhav Vineet, Yogesh S. Rawat
HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation
Trong-Thuan Nguyen, Pha Nguyen, Jackson Cothren et al.
Modeling dynamic social vision highlights gaps between deep learning and humans
Kathy Garcia, Emalie McMahon, Colin Conwell et al.
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi et al.
Progress-Aware Video Frame Captioning
Zihui Xue, Joungbin An, Xitong Yang et al.
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
Mingfei Han, Linjie Yang, Xiaojun Chang et al.
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
Ji Soo Lee, Jongha Kim, Jeehye Na et al.
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang, Yi Yang
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
COM Kitchens: An Unedited Overhead-view Procedural Videos Dataset as a Vision-Language Benchmark
Atsushi Hashimoto, Koki Maeda, Tosho Hirasawa et al.
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
Nina Shvetsova, Anna Kukleva, Xudong Hong et al.
Learning Video Context as Interleaved Multimodal Sequences
Qinghong Lin, Pengchuan Zhang, Difei Gao et al.
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace et al.
UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
Haoyu Lu, Yuqi Huo, Guoxing Yang et al.
Video ReCap: Recursive Captioning of Hour-Long Videos
Md Mohaiminul Islam, Vu Bao Ngan Ho, Xitong Yang et al.