"modality fusion" Papers
8 papers found
Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens
Zijian Dong, Ruilin Li, Joanna Chong et al.
NEURIPS 2025, arXiv:2509.24693
4 citations
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang, Jiayan Teng, Wendi Zheng et al.
ICLR 2025 (oral), arXiv:2408.06072
1409 citations
Diversity-oriented Deep Multi-modal Clustering
Wang Yanzheng, Xin Yang, Yujun Wang et al.
NEURIPS 2025
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
Yue Wu, Zhaobo Qi, Yiling Wu et al.
ICLR 2025
7 citations
Learning to Route: Per-Sample Adaptive Routing for Multimodal Multitask Prediction
Marzieh Ajirak, Oded Bein, Ellen Bowen et al.
NEURIPS 2025, arXiv:2509.12227
Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
Yichi Zhang, Zhuo Chen, Lingbing Guo et al.
ICLR 2025, arXiv:2405.16869
10 citations
Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
Zinuo Li, Xian Zhang, Yongxin Guo et al.
NEURIPS 2025 (oral), arXiv:2505.18110
3 citations
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Ishaan Rawal, Alexander Matyasko, Shantanu Jaiswal et al.
ICML 2024, arXiv:2306.08889
8 citations