"cross-modal retrieval" Papers
34 papers found
Conference
Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search
Shuyu Yang, Yaxiong Wang, Li Zhu et al.
CellCLIP - Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
MingYu Lu, Ethan Weinberger, Chanwoo Kim et al.
Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method
Han Wang, Shengyang Li, Jian Yang et al.
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Po-han Li, Sandeep Chinchali, Ufuk Topcu
Dynamic Masking and Auxiliary Hash Learning for Enhanced Cross-Modal Retrieval
Shuang Zhang, Yue Wu, Lei Shi et al.
Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces
Souhail Hadgi, Luca Moschella, Andrea Santilli et al.
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Pengfei Zhao, Rongbo Luan, Wei Zhang et al.
Learning Shared Representations from Unpaired Data
Amitai Yacobi, Nir Ben-Ari, Ronen Talmon et al.
Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval
Jiaxing Li, Lin Jiang, Zeqi Ma et al.
Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition - And Ways to Overcome Them
Harish Haresamudram, Apoorva Beedu, Mashfiqui Rabbi et al.
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi et al.
MotionBind: Multi-Modal Human Motion Alignment for Retrieval, Recognition, and Generation
Kaleab Kinfu, Rene Vidal
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap
Christopher Liao, Christian So, Theodoros Tsiligkaridis et al.
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval
Zengrong Lin, Zheng Wang, Tianwen Qian et al.
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
Davide Caffagni, Sara Sarto, Marcella Cornia et al.
Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels
Ruitao Pu, Yuan Sun, Yang Qin et al.
SEGA: Shaping Semantic Geometry for Robust Hashing under Noisy Supervision
Yiyang Gu, Bohan Wu, Qinghua Ran et al.
SensorLM: Learning the Language of Wearable Sensors
Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.
SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments
Simon Dahan, Gabriel Bénédict, Logan Williams et al.
Test-time Adaptation for Cross-modal Retrieval with Query Shift
Haobin Li, Peng Hu, Qianjun Zhang et al.
Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models
Young Kyun Jang, Ser-Nam Lim
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki Asano et al.
Unbiased Prototype Consistency Learning for Multi-Modal and Multi-Task Object Re-Identification
Zhongao Zhou, Bin Yang, Wenke Huang et al.
Wills Aligner: Multi-Subject Collaborative Brain Visual Decoding
Guangyin Bao, Qi Zhang, Zixuan Gong et al.
An Empirical Study of CLIP for Text-Based Person Search
Cao Min, Yang Bai, Ziyin Zeng et al.
CLIP-KD: An Empirical Study of CLIP Model Distillation
Chuanguang Yang, Zhulin An, Libo Huang et al.
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning
Bang Yang, Yong Dai, Xuxin Cheng et al.
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
Fangzhou Song, Bin Zhu, Yanbin Hao et al.
Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar, Pekka Marttinen
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation
Zhuohang Dang, Minnan Luo, Chengyou Jia et al.
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi et al.
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang, Guohao Sun, Pichao Wang et al.
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space
Kangning Yin, Shihao Zou, Yuxuan Ge et al.
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming, Sharon Li