Poster "cross-modal interaction" Papers
10 papers found
3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling
Qizhi Pei, Rui Yan, Kaiyuan Gao et al.
ICLR 2025, arXiv:2406.05797
6 citations
CyIN: Cyclic Informative Latent Space for Bridging Complete and Incomplete Multimodal Learning
Ronghao Lin, Qiaolin He, Sijie Mai et al.
NEURIPS 2025, arXiv:2602.04920
MUG: Pseudo Labeling Augmented Audio-Visual Mamba Network for Audio-Visual Video Parsing
Langyu Wang, Yingying Chen et al.
ICCV 2025, arXiv:2507.01384
1 citation
Multimodal 3D Genome Pre-training
Minghao Yang, Pengteng Li, Yan Liang et al.
NEURIPS 2025, arXiv:2504.09060
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.
ICCV 2025, arXiv:2506.07986
6 citations
Text-IRSTD: Leveraging Semantic Text to Promote Infrared Small Target Detection in Complex Scenes
Feng Huang, Shuyuan Zheng, Zhaobing Qiu et al.
ICCV 2025, arXiv:2503.07249
1 citation
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Sensen Gao, Xiaojun Jia, Xuhong Ren et al.
ECCV 2024, arXiv:2403.12445
34 citations
KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval
Xianwei Zhuang, Hongxiang Li, Xuxin Cheng et al.
ECCV 2024
10 citations
Libra: Building Decoupled Vision System on Large Language Models
Yifan Xu, Xiaoshan Yang, Yaguang Song et al.
ICML 2024, arXiv:2405.10140
10 citations
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Yake Wei, Di Hu
ICML 2024, arXiv:2405.17730
64 citations