"contrastive language-image pretraining" Papers
21 papers found
AmorLIP: Efficient Language-Image Pretraining via Amortization
Haotian Sun, Yitong Li, Yuchen Zhuang et al.
Attribute-based Visual Reprogramming for Vision-Language Models
Chengyi Cai, Zesheng Ye, Lei Feng et al.
DiffCLIP: Few-shot Language-driven Multimodal Classifier
Jiaqing Zhang, Mingxiang Cao, Xue Yang et al.
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation
Aishik Konwer, Zhijian Yang, Erhan Bas et al.
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Pengfei Zhao, Rongbo Luan, Wei Zhang et al.
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners
Jingyi Yang, Zitong Yu, Xiuming Ni et al.
Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability
Zhiyu Zhu, Zhibo Jin, Jiayu Zhang et al.
ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models
Yassir Bendou, Amine Ouasfi, Vincent Gripon et al.
Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation
Yuheng Feng, Changsong Wen, Zelin Peng et al.
Scaling Language-Free Visual Representation Learning
David Fan, Shengbang Tong, Jiachen Zhu et al.
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Cristian Rodriguez-Opazo, Ehsan Abbasnejad, Damien Teney et al.
Text and Image Are Mutually Beneficial: Enhancing Training-Free Few-Shot Classification with CLIP
Yayuan Li, Jintao Guo, Lei Qi et al.
Vision-Language Models Do Not Understand Negation
Kumail Alhamoud, Shaden Alshammari, Yonglong Tian et al.
Anchor-based Robust Finetuning of Vision-Language Models
Jinwei Han, Zhiwen Lin, Zhongyisun Sun et al.
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
Yi Zhang, Ce Zhang, Ke Yu et al.
Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection
Chentao Cao, Zhun Zhong, Zhanke Zhou et al.
Expediting Contrastive Language-Image Pretraining via Self-Distilled Encoders
Bumsoo Kim, Jinhyung Kim, Yeonsik Jo et al.
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
Yu Zhang, Qi Zhang, Zixuan Gong et al.
MoDE: CLIP Data Experts via Clustering
Jiawei Ma, Po-Yao Huang, Saining Xie et al.
OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport
Liangliang Shi, Jack Fan, Junchi Yan
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models
Hao Cheng, Erjia Xiao, Jindong Gu et al.