Poster "visual feature extraction" Papers
3 papers found
DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
Xiaoyi Bao, Chen-Wei Xie, Hao Tang et al.
ICCV 2025, arXiv:2507.15569
1 citation
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
Bin Wang, Fan Wu, Linke Ouyang et al.
CVPR 2025, arXiv:2409.03643
13 citations
On the Out-Of-Distribution Generalization of Large Multimodal Models
Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.
CVPR 2025
4 citations