"multi-modal llms" Papers

10 papers found

F³Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos

Zhaoyu Liu, Kan Jiang, Murong Ma et al.

ICLR 2025 (oral) · 3 citations

AC-LoRA: (Almost) Training-Free Access Control Aware Multi-Modal LLMs

Lara Magdalena Lazier, Aritra Dhar, Vasilije Stambolic et al.

NeurIPS 2025 · arXiv:2505.11557 · 1 citation

AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation

Jingyi Xie, Jintao Yang, Zhunchen Luo et al.

CVPR 2025

AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning

Yiwu Zhong, Zhuoming Liu, Yin Li et al.

ICCV 2025 · arXiv:2412.03248 · 24 citations

ALLVB: All-in-One Long Video Understanding Benchmark

Xichen Tan, Yuanjing Luo, Yunfan Ye et al.

AAAI 2025 · arXiv:2503.07298 · 6 citations

Can Multi-Modal LLMs Provide Live Step-by-Step Task Guidance?

Apratim Bhattacharyya, Bicheng Xu, Sanjay Haresh et al.

NeurIPS 2025 · arXiv:2511.21998

Distilling Multi-modal Large Language Models for Autonomous Driving

Deepti Hegde, Rajeev Yasarla, Hong Cai et al.

CVPR 2025 · arXiv:2501.09757 · 29 citations

The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

Hong Li, Nanxi Li, Yuanjie Chen et al.

ICLR 2025 · arXiv:2410.01417 · 3 citations

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Chaoyou Fu, Yuhan Dai, Yongdong Luo et al.

CVPR 2025 (highlight) · arXiv:2405.21075 · 917 citations

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Renrui Zhang, Dongzhi Jiang, Yichi Zhang et al.

ECCV 2024 · arXiv:2403.14624 · 498 citations