Cong Wei
9 papers · 2,019 total citations
papers (9)
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
CVPR 2024 · arXiv · 1,715 citations

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
ECCV 2024 · arXiv · 139 citations

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
ICLR 2025 · arXiv · 91 citations

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
ICCV 2025 · arXiv · 22 citations

Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
CVPR 2023 · arXiv · 22 citations

InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models
ICCV 2025 · arXiv · 13 citations

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
CVPR 2025 · arXiv · 11 citations

HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver
CVPR 2025 · 4 citations

Advancing Visual Large Language Model for Multi-granular Versatile Perception
ICCV 2025 · arXiv · 2 citations