Poster "cross-modal retrieval" Papers

22 papers found

CellCLIP - Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning

MingYu Lu, Ethan Weinberger, Chanwoo Kim et al.

NEURIPS 2025 · arXiv:2506.06290 · 2 citations

Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method

Han Wang, Shengyang Li, Jian Yang et al.

ICCV 2025 · arXiv:2506.22027 · 6 citations

CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features

Po-han Li, Sandeep Chinchali, Ufuk Topcu

ICLR 2025 · arXiv:2410.07610 · 5 citations

Dynamic Masking and Auxiliary Hash Learning for Enhanced Cross-Modal Retrieval

Shuang Zhang, Yue Wu, Lei Shi et al.

NEURIPS 2025

Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces

Souhail Hadgi, Luca Moschella, Andrea Santilli et al.

CVPR 2025 · arXiv:2503.05283 · 2 citations

Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment

Pengfei Zhao, Rongbo Luan, Wei Zhang et al.

NEURIPS 2025 · arXiv:2506.06970 · 1 citation

Learning Shared Representations from Unpaired Data

Amitai Yacobi, Nir Ben-Ari, Ronen Talmon et al.

NEURIPS 2025 · arXiv:2505.21524

MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs

Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi et al.

ICLR 2025 · arXiv:2411.02571 · 86 citations

Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap

Christopher Liao, Christian So, Theodoros Tsiligkaridis et al.

ICLR 2025 · arXiv:2402.04416 · 1 citation

NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval

Zengrong Lin, Zheng Wang, Tianwen Qian et al.

CVPR 2025 · arXiv:2503.10526 · 2 citations

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025 · arXiv:2503.01980 · 6 citations

SEGA: Shaping Semantic Geometry for Robust Hashing under Noisy Supervision

Yiyang Gu, Bohan Wu, Qinghua Ran et al.

NEURIPS 2025

SensorLM: Learning the Language of Wearable Sensors

Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.

NEURIPS 2025 · arXiv:2506.09108 · 19 citations

SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments

Simon Dahan, Gabriel Bénédict, Logan Williams et al.

ICLR 2025 · arXiv:2501.16471 · 3 citations

Test-time Adaptation for Cross-modal Retrieval with Query Shift

Haobin Li, Peng Hu, Qianjun Zhang et al.

ICLR 2025 · arXiv:2410.15624 · 9 citations

Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models

Young Kyun Jang, Ser-Nam Lim

ICCV 2025 · arXiv:2405.14715 · 2 citations

TULIP: Token-length Upgraded CLIP

Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki Asano et al.

ICLR 2025 · arXiv:2410.10034 · 17 citations

CLIP-KD: An Empirical Study of CLIP Model Distillation

Chuanguang Yang, Zhulin An, Libo Huang et al.

CVPR 2024 · arXiv:2307.12732 · 86 citations

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective

Fangzhou Song, Bin Zhu, Yanbin Hao et al.

ECCV 2024 · arXiv:2312.04763 · 10 citations

Improving Medical Multi-modal Contrastive Learning with Expert Annotations

Yogesh Kumar, Pekka Marttinen

ECCV 2024 · arXiv:2403.10153 · 23 citations

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

Samuele Poppi, Tobia Poppi, Federico Cocchi et al.

ECCV 2024 · arXiv:2311.16254 · 10 citations

Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models

Yifei Ming, Sharon Li

ICML 2024 · arXiv:2405.01468 · 10 citations