"multimodal perception" Papers
7 papers found
ADAM: An Embodied Causal Agent in Open-World Environments
Shu Yu, Chaochao Lu
ICLR 2025, arXiv:2410.22194
10 citations
Dynamic Derivation and Elimination: Audio Visual Segmentation with Enhanced Audio Semantics
Chen Liu, Liying Yang, Peike Li et al.
CVPR 2025, arXiv:2503.12840
4 citations
Layer-Wise Modality Decomposition for Interpretable Multimodal Sensor Fusion
Jaehyun Park, Konyul Park, Daehun Kim et al.
NeurIPS 2025, arXiv:2511.00859
Massive Sound Embedding Benchmark (MSEB)
Georg Heigold, Ehsan Variani, Tom Bagby et al.
NeurIPS 2025
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
Yaoting Wang, Peiwen Sun, Dongzhan Zhou et al.
ECCV 2024, arXiv:2407.10957
25 citations
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
Qi Lv, Hao Li, Xiang Deng et al.
ICML 2024, arXiv:2404.04929
4 citations
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang, Wang Zeng, Sheng Jin et al.
ECCV 2024, arXiv:2407.10125
21 citations