Pengchuan Zhang
20
papers
6,872
total citations
Papers (20)
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
ECCV 2020
arXiv
2,159
citations
Grounded Language-Image Pre-Training
CVPR 2022
arXiv
1,431
citations
RegionCLIP: Region-Based Language-Image Pretraining
CVPR 2022
arXiv
781
citations
An Empirical Study of Training End-to-End Vision-and-Language Transformers
CVPR 2022
arXiv
439
citations
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
ICCV 2021
arXiv
374
citations
GLIPv2: Unifying Localization and Vision-Language Understanding
NeurIPS 2022
arXiv
357
citations
Unified Contrastive Learning in Image-Text-Label Space
CVPR 2022
arXiv
276
citations
UniVTG: Towards Unified Video-Language Temporal Grounding
ICCV 2023
arXiv
195
citations
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
NeurIPS 2022
arXiv
178
citations
VinVL: Revisiting Visual Representations in Vision-Language Models
CVPR 2021
arXiv
169
citations
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
NeurIPS 2022
arXiv
153
citations
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
ICCV 2023
arXiv
138
citations
K-LITE: Learning Transferable Visual Models with External Knowledge
NeurIPS 2022
arXiv
96
citations
3DB: A Framework for Debugging Computer Vision Models
NeurIPS 2022
arXiv
44
citations
Revisiting the Role of Language Priors in Vision-Language Models
ICML 2024
arXiv
39
citations
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
CVPR 2023
arXiv
28
citations
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
CVPR 2024
arXiv
15
citations
DIME-FM: DIstilling Multimodal and Efficient Foundation Models
ICCV 2023
0
citations
Dynamic DETR: End-to-End Object Detection With Dynamic Attention
ICCV 2021
0
citations
Focal Attention for Long-Range Interactions in Vision Transformers
NeurIPS 2021
0
citations