"multi-modal llms" Papers
10 papers found
$F^3Set$: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu, Kan Jiang, Murong Ma et al.
ICLR 2025 (oral)
3 citations
AC-LoRA: (Almost) Training-Free Access Control Aware Multi-Modal LLMs
Lara Magdalena Lazier, Aritra Dhar, Vasilije Stambolic et al.
NeurIPS 2025, arXiv:2505.11557
1 citation
AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation
Jingyi Xie, Jintao Yang, Zhunchen Luo et al.
CVPR 2025
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
Yiwu Zhong, Zhuoming Liu, Yin Li et al.
ICCV 2025, arXiv:2412.03248
24 citations
ALLVB: All-in-One Long Video Understanding Benchmark
Xichen Tan, Yuanjing Luo, Yunfan Ye et al.
AAAI 2025, arXiv:2503.07298
6 citations
Can Multi-Modal LLMs Provide Live Step-by-Step Task Guidance?
Apratim Bhattacharyya, Bicheng Xu, Sanjay Haresh et al.
NeurIPS 2025, arXiv:2511.21998
Distilling Multi-modal Large Language Models for Autonomous Driving
Deepti Hegde, Rajeev Yasarla, Hong Cai et al.
CVPR 2025, arXiv:2501.09757
29 citations
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
Hong Li, Nanxi Li, Yuanjie Chen et al.
ICLR 2025, arXiv:2410.01417
3 citations
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Chaoyou Fu, Yuhan Dai, Yongdong Luo et al.
CVPR 2025 (highlight), arXiv:2405.21075
917 citations
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang, Dongzhi Jiang, Yichi Zhang et al.
ECCV 2024, arXiv:2403.14624
498 citations