"optical character recognition" Papers
5 papers found
Conference
CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy
Zhibo Yang, Jun Tang, Zhaohai Li et al.
ICCV 2025arXiv:2412.02210
43
citations
Detecting Visual Information Manipulation Attacks in Augmented Reality: A Multimodal Semantic Reasoning Approach
Yanming Xiu, Maria Gorlatova
ISMAR 2025paperarXiv:2507.20356
7
citations
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi, Fuxiao Liu, Shihao Wang et al.
ICLR 2025arXiv:2408.15998
116
citations
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.
ECCV 2024arXiv:2403.07508
34
citations
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan, Pei Fu, Shan Guo et al.
CVPR 2024arXiv:2403.00303
16
citations