by Junfei Wu Papers
3 papers found
Conference
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
YiFan Zhang, Huanyu Zhang, Haochen Tian et al.
ICLR 2025arXiv:2408.13257
147
citations
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
Junfei Wu, Jian Guan, Kaituo Feng et al.
NEURIPS 2025arXiv:2506.09965
54
citations
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng, Kaixiong Gong, Bohao Li et al.
NEURIPS 2025oralarXiv:2503.21776
257
citations