Chaoyou Fu
h-index: 17
Papers: 19
Total citations: 2,968
papers (19)
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
NEURIPS 2025
arXiv
1,277
citations
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
CVPR 2025
arXiv
917
citations
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification
ICCV 2021
arXiv
141
citations
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
NEURIPS 2025
arXiv
138
citations
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
ICML 2025
arXiv
112
citations
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
ICML 2025
arXiv
94
citations
Aligning and Prompting Everything All at Once for Universal Visual Perception
CVPR 2024
arXiv
69
citations
Cross-Spectral Face Hallucination via Disentangling Independent Factors
CVPR 2020
arXiv
59
citations
Multi-modal Queried Object Detection in the Wild
NEURIPS 2023
arXiv
49
citations
AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection
NEURIPS 2020
arXiv
32
citations
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
CVPR 2024
arXiv
27
citations
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
NEURIPS 2025
17
citations
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption
CVPR 2025
arXiv
14
citations
Pareidolia Face Reenactment
CVPR 2021
arXiv
12
citations
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
ICLR 2025
arXiv
5
citations
CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes
NEURIPS 2023
arXiv
3
citations
Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs
NEURIPS 2025
arXiv
2
citations
Information Bottleneck Disentanglement for Identity Swapping
CVPR 2021
0
citations
Rethinking Image Cropping: Exploring Diverse Compositions From Global Views
CVPR 2022
0
citations