"multi-modal benchmarks" Papers
3 papers found
Conference
Enhancing Vision-Language Model with Unmasked Token Alignment
Hongsheng Li, Jihao Liu, Boxiao Liu et al.
ICLR 2025, arXiv:2405.19009
Multi-modal Learning: A Look Back and the Road Ahead
Divyam Madaan, Sumit Chopra, Kyunghyun Cho
ICLR 2025
BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events
Yijin Li, Yichen Shen, Zhaoyang Huang et al.
ECCV 2024, arXiv:2410.20451
8 citations