Poster "multimodal representation learning" Papers
19 papers found
Aligning Multimodal Representations through an Information Bottleneck
Antonio Almudévar, Jose Miguel Hernandez-Lobato, Sameer Khurana et al.
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
Jinlan Fu, Shenzhen Huangfu, Hao Fei et al.
CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design
Wenji Fang, Shang Liu, Jing Wang et al.
Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces
Souhail Hadgi, Luca Moschella, Andrea Santilli et al.
Gramian Multimodal Representation Learning and Alignment
Giordano Cicchetti, Eleonora Grassucci, Luigi Sigillo et al.
Learning Shared Representations from Unpaired Data
Amitai Yacobi, Nir Ben-Ari, Ronen Talmon et al.
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
Khanh Nguyen, Ghulam Mubashar Hassan, Ajmal Mian
The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
Yongwei Che, Benjamin Eysenbach
Understanding Co-speech Gestures in-the-wild
Sindhu Hegde, K R Prajwal, Taein Kwon et al.
Understanding the Gain from Data Filtering in Multimodal Contrastive Learning
Divyansh Pareek, Sewoong Oh, Simon Du
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
Contrastive ground-level image and remote sensing pre-training improves representation learning for natural world imagery
Andy V Huynh, Lauren Gillespie, Jael Lopez-Saucedo et al.
DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment
Yunpeng Bai, Xintao Wang, Yanpei Cao et al.
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
Nina Shvetsova, Anna Kukleva, Xudong Hong et al.
Learning Multimodal Latent Generative Models with Energy-Based Prior
Shiyu Yuan, Jiali Cui, Hanao Li et al.
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
Yu Zhang, Qi Zhang, Zixuan Gong et al.
Multimodal Patient Representation Learning with Missing Modalities and Labels
Zhenbang Wu, Anant Dadu, Nicholas Tustison et al.
Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong, Eleni Chatzi, Olga Fink
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Xiaoxuan He, Yifan Yang, Xinyang Jiang et al.