"modality gap" Papers
21 papers found
Conference
Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation
Jianyuan Guo, Peike Li, Trevor Cohn
Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations
Jeonghyeon Kim, Sangheum Hwang
Global Minimizers of Sigmoid Contrastive Loss
Kiril Bangachev, Guy Bresler, Iliyas Noman et al.
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Pengfei Zhao, Rongbo Luan, Wei Zhang et al.
Learning Visual Proxy for Compositional Zero-Shot Learning
Shiyu Zhang, Cheng Yan, Yang Liu et al.
Mind the Gap: Preserving and Compensating for the Modality Gap in CLIP-Based Continual Learning
Linlan Huang, Xusheng Cao, Haori Lu et al.
Mitigate the Gap: Improving Cross-Modal Alignment in CLIP
Sedigheh Eslami, Gerard de Melo
Post-pre-training for Modality Alignment in Vision-Language Foundation Models
Shin'ya Yamaguchi, Dewei Feng, Sekitoshi Kanai et al.
Rebalancing Contrastive Alignment with Bottlenecked Semantic Increments in Text-Video Retrieval
Jian Xiao, Zijie Song, Jialong Hu et al.
Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer
Ningyuan Zhang, Jie Lu, Keqiuyin Li et al.
Superpowering Open-Vocabulary Object Detectors for X-ray Vision
Pablo Garcia-Fernandez, Lorenzo Vaquero, Mingxuan Liu et al.
Test-time Adaptation for Cross-modal Retrieval with Query Shift
Haobin Li, Peng Hu, Qianjun Zhang et al.
Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP
Zhongxing Xu, Feilong Tang, Zhe Chen et al.
ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models
Zixun Fang, Kai Zhu, Zhiheng Liu et al.
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
Xiangpeng Yang, Linchao Zhu, Xiaohan Wang et al.
Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning
Zhiyue Liu, Jinyuan Liu, Fanrong Ma
Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar, Pekka Marttinen
Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
Yicheng Liu, Jie Wen, Chengliang Liu et al.
Learning Modality Knowledge Alignment for Cross-Modality Transfer
Wenxuan Ma, Shuang Li, Lincan Cai et al.
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
Xinyao Li, Yuke Li, Zhekai Du et al.