Zhongang Qi
20 papers, 3,131 total citations
papers (20)
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models
AAAI 2024
arXiv
1,460
citations
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
ICCV 2023
arXiv
698
citations
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
CVPR 2024
arXiv
327
citations
LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation
CVPR 2023
arXiv
296
citations
Taming Rectified Flow for Inversion and Editing
ICML 2025
arXiv
119
citations
Open-Book Video Captioning With Retrieve-Copy-Generate Network
CVPR 2021
arXiv
113
citations
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
AAAI 2025
arXiv
55
citations
Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
NeurIPS 2021
arXiv
43
citations
EA-VTR: Event-Aware Video-Text Retrieval
ECCV 2024
arXiv
7
citations
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?
CVPR 2024
arXiv
4
citations
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
NeurIPS 2025
arXiv
4
citations
Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
CVPR 2025
arXiv
3
citations
DOGR: Towards Versatile Visual Document Grounding and Referring
ICCV 2025
arXiv
2
citations
BTS: A Bi-Lingual Benchmark for Text Segmentation in the Wild
CVPR 2022
0
citations
ViLEM: Visual-Language Error Modeling for Image-Text Retrieval
CVPR 2023
0
citations
Exploiting Contextual Objects and Relations for 3D Visual Grounding
NeurIPS 2023
0
citations
Order-Prompted Tag Sequence Generation for Video Tagging
ICCV 2023
0
citations
Less is More: Empowering GUI Agent with Context-Aware Simplification
ICCV 2025
arXiv
0
citations
VisionMath: Vision-Form Mathematical Problem-Solving
ICCV 2025
0
citations
Mamba-3VL: Taming State Space Model for 3D Vision Language Learning
ICCV 2025
0
citations