"image-text matching" Papers

10 papers found

Aligning Information Capacity Between Vision and Language via Dense-to-Sparse Feature Distillation for Image-Text matching

Yang Liu, Wentao Feng, Zhuoyao Liu et al.

ICCV 2025arXiv:2503.14953
1
citations

C-CLIP: Multimodal Continual Learning for Vision-Language Model

Wenzhuo Liu, Fei Zhu, Longhui Wei et al.

ICLR 2025
13
citations

CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation

Reza Abbasi, Ali Nazari, Aminreza Sefid et al.

CVPR 2025arXiv:2502.19842
6
citations

LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching

Meng Tian, Shuo Yang, Xinxiao Wu

ICCV 2025arXiv:2506.23502
1
citations

MASS: Overcoming Language Bias in Image-Text Matching

Jiwan Chung, Seungwon Lim, Sangkyu Lee et al.

AAAI 2025paperarXiv:2501.11469

ViUniT: Visual Unit Tests for More Robust Visual Programming

Artemis Panagopoulou, Honglu Zhou, silvio savarese et al.

CVPR 2025arXiv:2412.08859
2
citations

FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

Hang Hua, Jing Shi, Kushal Kafle et al.

ECCV 2024arXiv:2404.14715
14
citations

Gradient-based Visual Explanation for Transformer-based CLIP

Chenyang ZHAO, Kun Wang, Xingyu Zeng et al.

ICML 2024

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations

Yufeng Huang, Jiji Tang, Zhuo Chen et al.

AAAI 2024paperarXiv:2305.06152
53
citations

SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining

Chull Hwan Song, Taebaek Hwang, Jooyoung Yoon et al.

CVPR 2024arXiv:2404.01156
11
citations