"vision-language pretraining" Papers
9 papers found
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li et al.
ICCV 2025 (highlight) | arXiv:2501.00958
5 citations
DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval
Yating Liu, Zimo Liu, Xiangyuan Lan et al.
AAAI 2025 (paper) | arXiv:2503.04144
10 citations
Enhancing Fine-Grained Vision-Language Pretraining with Negative Augmented Samples
Yeyuan Wang, Dehong Gao, Lei Yi et al.
AAAI 2025 (paper) | arXiv:2412.10029
4 citations
Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining
Mikey Shechter, Yair Carmon
NeurIPS 2025 | arXiv:2503.08805
2 citations
OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining
Ming Hu, Kun Yuan, Yaling Shen et al.
ICCV 2025 | arXiv:2411.15421
25 citations
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
Zhizhen Zhang, Lei Zhu, Zhen Fang et al.
NeurIPS 2025 (oral) | arXiv:2502.01218
2 citations
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li, Qi Ma, Runyi Yang et al.
ICCV 2025 | arXiv:2503.18052
21 citations
SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining
Chull Hwan Song, Taebaek Hwang, Jooyoung Yoon et al.
CVPR 2024 | arXiv:2404.01156
11 citations
Zero-shot Object Counting with Good Exemplars
Huilin Zhu, Jingling Yuan, Zhengwei Yang et al.
ECCV 2024 | arXiv:2407.04948
28 citations