Chunyuan Li
27 papers, 21,273 total citations
Papers (27)
Visual Instruction Tuning (NeurIPS 2023; arXiv): 7,821 citations
Improved Baselines with Visual Instruction Tuning (CVPR 2024; arXiv): 4,359 citations
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks (ECCV 2020; arXiv): 2,159 citations
Grounded Language-Image Pre-Training (CVPR 2022; arXiv): 1,431 citations
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day (NeurIPS 2023; arXiv): 1,391 citations
GLIGEN: Open-Set Grounded Text-to-Image Generation (CVPR 2023; arXiv): 816 citations
RegionCLIP: Region-Based Language-Image Pretraining (CVPR 2022; arXiv): 781 citations
Focal Modulation Networks (NeurIPS 2022; arXiv): 394 citations
Generalized Decoding for Pixel, Image, and Language (CVPR 2023; arXiv): 336 citations
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training (CVPR 2020; arXiv): 326 citations
Unified Contrastive Learning in Image-Text-Label Space (CVPR 2022; arXiv): 276 citations
A Simple Framework for Open-Vocabulary Segmentation and Detection (ICCV 2023; arXiv): 216 citations
Towards Language-Free Training for Text-to-Image Generation (CVPR 2022; arXiv): 209 citations
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models (NeurIPS 2022; arXiv): 178 citations
LLaVA-Critic: Learning to Evaluate Multimodal Models (CVPR 2025; arXiv): 103 citations
K-LITE: Learning Transferable Visual Models with External Knowledge (NeurIPS 2022; arXiv): 96 citations
Large Language Models are Visual Reasoning Coordinators (NeurIPS 2023; arXiv): 95 citations
Learning Customized Visual Models With Retrieval-Augmented Knowledge (CVPR 2023; arXiv): 79 citations
Visual In-Context Prompting (CVPR 2024; arXiv): 54 citations
Structure-Aware Human-Action Generation (ECCV 2020; arXiv): 45 citations
Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation (ICCV 2021; arXiv): 33 citations
Graphic Design with Large Multimodal Model (AAAI 2025; arXiv): 29 citations
Partition-Guided GANs (CVPR 2021; arXiv): 22 citations
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning (ICLR 2025; arXiv): 17 citations
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines (ICLR 2025): 7 citations
Focal Attention for Long-Range Interactions in Vision Transformers (NeurIPS 2021): 0 citations
Position: TrustLLM: Trustworthiness in Large Language Models (ICML 2024): 0 citations