"multimodal in-context learning" Papers
6 papers found
Conference
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Wei Chen, Lin Li, Yongqi Yang et al.
CVPR 2025 (Highlight), arXiv:2406.10462
12 citations
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Qingyun Li, Zhe Chen, Weiyun Wang et al.
ICLR 2025, arXiv:2406.08418
49 citations
SMMILE: An expert-driven benchmark for multimodal medical in-context learning
Melanie Rieff, Maya Varma, Ossian Rabow et al.
NeurIPS 2025, arXiv:2506.21355
3 citations
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales
ICLR 2025, arXiv:2403.13164
18 citations
FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction
Hang Hua, Jing Shi, Kushal Kafle et al.
ECCV 2024, arXiv:2404.14715
14 citations
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning
Chaoyi Zhang, Kevin Lin, Zhengyuan Yang et al.
CVPR 2024 (Highlight), arXiv:2311.17435
50 citations