Poster "multimodal pretraining" Papers
7 papers found
Conference
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.
CVPR 2025arXiv:2411.18674
15
citations
Do LVLMs Truly Understand Video Anomalies? Revealing Hallucination via Co-Occurrence Patterns
Menghao Zhang, Huazheng Wang, Pengfei Ren et al.
NEURIPS 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li, Yutong Chen, Yiqian Wu et al.
ICCV 2025arXiv:2506.07886
5
citations
G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion
Mengdi Liu, Zhangyang Gao, Hong Chang et al.
ICCV 2025arXiv:2502.04684
1
citations
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum, Wen-Han Hsieh, Tsung-Han (Patrick) Wu et al.
NEURIPS 2025arXiv:2505.02829
4
citations
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Qingtao Liu, Yu Cui, Zhengnan Sun et al.
ICLR 2025
11
citations
Multistain Pretraining for Slide Representation Learning in Pathology
Guillaume Jaume, Anurag J Vaidya, Andrew Zhang et al.
ECCV 2024arXiv:2408.02859
27
citations