Xiyang Dai
27 papers, 8,917 total citations
papers (27)
CvT: Introducing Convolutions to Vision Transformers (ICCV 2021, arXiv): 2,302 citations
Dynamic Convolution: Attention Over Convolution Kernels (CVPR 2020, arXiv): 1,196 citations
Dynamic Head: Unifying Object Detection Heads With Attentions (CVPR 2021, arXiv): 811 citations
RegionCLIP: Region-Based Language-Image Pretraining (CVPR 2022, arXiv): 781 citations
Mobile-Former: Bridging MobileNet and Transformer (CVPR 2022, arXiv): 634 citations
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks (CVPR 2024, arXiv): 409 citations
Focal Modulation Networks (NeurIPS 2022, arXiv): 394 citations
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding (ICCV 2021, arXiv): 374 citations
GLIPv2: Unifying Localization and Vision-Language Understanding (NeurIPS 2022, arXiv): 357 citations
Rewrite the Stars (CVPR 2024, arXiv): 352 citations
Generalized Decoding for Pixel, Image, and Language (CVPR 2023, arXiv): 336 citations
BEVT: BERT Pretraining of Video Transformers (CVPR 2022, arXiv): 249 citations
Dynamic ReLU (ECCV 2020, arXiv): 198 citations
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-Supervised Video Representation Learning (CVPR 2023, arXiv): 121 citations
MicroNet: Improving Image Recognition With Extremely Low FLOPs (ICCV 2021, arXiv): 104 citations
Reduce Information Loss in Transformers for Pluralistic Image Inpainting (CVPR 2022, arXiv): 89 citations
Stronger NAS with Weaker Predictors (NeurIPS 2021, arXiv): 55 citations
Look Before You Match: Instance Understanding Matters in Video Object Segmentation (CVPR 2023, arXiv): 54 citations
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning (NeurIPS 2022, arXiv): 31 citations
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding (CVPR 2023, arXiv): 28 citations
DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search (ECCV 2020, arXiv): 22 citations
Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection (NeurIPS 2023, arXiv): 15 citations
Should All Proposals Be Treated Equally in Object Detection? (ECCV 2022, arXiv): 4 citations
Exploring Invariance in Images through One-way Wave Equations (ICML 2025, arXiv): 1 citation
Dynamic DETR: End-to-End Object Detection With Dynamic Attention (ICCV 2021): 0 citations
Focal Attention for Long-Range Interactions in Vision Transformers (NeurIPS 2021): 0 citations
METAL: Minimum Effort Temporal Activity Localization in Untrimmed Videos (CVPR 2020): 0 citations