Poster "multimodal video understanding" Papers
3 papers found
Conference
ConViS-Bench: Estimating Video Similarity Through Semantic Concepts
Benedetta Liberatori, Alessandro Conti, Lorenzo Vaquero et al.
NEURIPS 2025arXiv:2509.19245
1
citations
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
Xuehai He, Weixi Feng, Kaizhi Zheng et al.
ICLR 2025arXiv:2406.08407
36
citations
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement
Hao Wu, Huabin Liu, Yu Qiao et al.
CVPR 2024arXiv:2404.02755
20
citations