"video-text understanding" Papers
4 papers found
Conference
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
Yue Jiang, Jichu Li, Yang Liu et al.
NEURIPS 2025oralarXiv:2505.18411
5
citations
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
Fanrui Zhang, Dian Li, Qiang Zhang et al.
NEURIPS 2025arXiv:2505.16836
4
citations
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Ishaan Rawal, Alexander Matyasko, Shantanu Jaiswal et al.
ICML 2024arXiv:2306.08889
8
citations
DoRA: Weight-Decomposed Low-Rank Adaptation
Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin et al.
ICML 2024arXiv:2402.09353
706
citations