"multimodal pretraining" Papers
9 papers found
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.
CVPR 2025
arXiv:2411.18674
15 citations
Do LVLMs Truly Understand Video Anomalies? Revealing Hallucination via Co-Occurrence Patterns
Menghao Zhang, Huazheng Wang, Pengfei Ren et al.
NeurIPS 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li, Yutong Chen, Yiqian Wu et al.
ICCV 2025
arXiv:2506.07886
5 citations
G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion
Mengdi Liu, Zhangyang Gao, Hong Chang et al.
ICCV 2025
arXiv:2502.04684
1 citation
LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining
Huawen Shen, Gengluo Li, Jinwen Zhong et al.
AAAI 2025
arXiv:2412.14596
6 citations
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum, Wen-Han Hsieh, Tsung-Han (Patrick) Wu et al.
NeurIPS 2025
arXiv:2505.02829
4 citations
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Qingtao Liu, Yu Cui, Zhengnan Sun et al.
ICLR 2025
11 citations
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Jianxiong Li, Jinliang Zheng, Yinan Zheng et al.
ICML 2024 (oral)
arXiv:2402.18137
14 citations
Multistain Pretraining for Slide Representation Learning in Pathology
Guillaume Jaume, Anurag J Vaidya, Andrew Zhang et al.
ECCV 2024
arXiv:2408.02859
27 citations