"multi-modal foundation models" Papers
2 papers found
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Qinghao Ye, Haiyang Xu, Jiabo Ye et al.
CVPR 2024 (Highlight), arXiv:2311.04257
614 citations
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Christian Schlarmann, Naman Singh, Francesco Croce et al.
ICML 2024, arXiv:2402.12336
88 citations