Baining Guo
23 papers, 37,885 total citations
papers (23)
Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
ICCV 2021
arXiv
29,293
citations
Swin Transformer V2: Scaling Up Capacity and Resolution
CVPR 2022
arXiv
2,487
citations
CSWin Transformer: A General Vision Transformer Backbone With Cross-Shaped Windows
CVPR 2022
arXiv
1,252
citations
Face X-Ray for More General Face Forgery Detection
CVPR 2020
arXiv
1,068
citations
Vector Quantized Diffusion Model for Text-to-Image Synthesis
CVPR 2022
arXiv
963
citations
Learning Texture Transformer Network for Image Super-Resolution
CVPR 2020
arXiv
826
citations
RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion
CVPR 2023
arXiv
359
citations
StyleSwin: Transformer-Based GAN for High-Resolution Image Generation
CVPR 2022
arXiv
296
citations
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
CVPR 2023
arXiv
259
citations
Advancing High-Resolution Video-Language Representation With Large-Scale Video Transcriptions
CVPR 2022
arXiv
254
citations
Efficient Diffusion Training via Min-SNR Weighting Strategy
ICCV 2023
arXiv
228
citations
Protecting Celebrities From DeepFake With Identity Consistency Transformer
CVPR 2022
arXiv
164
citations
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
CVPR 2024
arXiv
162
citations
CCEdit: Creative and Controllable Video Editing via Diffusion Models
CVPR 2024
arXiv
80
citations
Adaptive Frequency Filters As Efficient Global Token Mixers
ICCV 2023
arXiv
80
citations
Improved Noise Schedule for Diffusion Training
ICCV 2025
arXiv
33
citations
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
CVPR 2024
arXiv
28
citations
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
CVPR 2025
arXiv
20
citations
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping
CVPR 2025
arXiv
16
citations
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
ICCV 2025
arXiv
12
citations
VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
NeurIPS 2025
arXiv
5
citations
iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition
CVPR 2023
0
citations
Improving CLIP Fine-tuning Performance
ICCV 2023
0
citations