"vision encoders" Papers
7 papers found
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Orr Zohar, Xiaohan Wang, Yann Dubois et al.
CVPR 2025, arXiv:2412.10360
55 citations
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi, Fuxiao Liu, Shihao Wang et al.
ICLR 2025, arXiv:2408.15998
116 citations
Evaluating Vision-Language Models as Evaluators in Path Planning
Mohamed Aghzal, Xiang Yue, Erion Plaku et al.
CVPR 2025, arXiv:2411.18711
4 citations
Multimodal Autoregressive Pre-training of Large Vision Encoders
Enrico Fini, Mustafa Shukor, Xiujun Li et al.
CVPR 2025 (highlight), arXiv:2411.14402
77 citations
ParGo: Bridging Vision-Language with Partial and Global Views
An-Lan Wang, Bin Shan, Wei Shi et al.
AAAI 2025, arXiv:2408.12928
25 citations
Scaling Language-Free Visual Representation Learning
David Fan, Shengbang Tong, Jiachen Zhu et al.
ICCV 2025 (highlight), arXiv:2504.01017
41 citations
Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models
Zhaoyi Liu, Huan Zhang
CVPR 2025, arXiv:2502.18290
9 citations