Pan Zhang
24 papers, 2,082 total citations
Papers (24)
Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation
CVPR 2021, arXiv, 564 citations

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
CVPR 2024, arXiv, 385 citations

Cross-Domain Correspondence Learning for Exemplar-Based Image Translation
CVPR 2020, arXiv, 263 citations

Bringing Old Photos Back to Life
CVPR 2020, arXiv, 238 citations

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
CVPR 2024, arXiv, 170 citations

V3Det: Vast Vocabulary Visual Detection Dataset
ICCV 2023, arXiv, 81 citations

MetaPortrait: Identity-Preserving Talking Head Generation With Fast Personalized Adaptation
CVPR 2023, arXiv, 80 citations

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree
ICCV 2025, arXiv, 56 citations

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
CVPR 2025, arXiv, 40 citations

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
CVPR 2025, arXiv, 36 citations

FreeDrag: Feature Dragging for Reliable Point-based Image Editing
CVPR 2024, arXiv, 30 citations

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
ICML 2025, arXiv, 23 citations

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
ICLR 2025, arXiv, 22 citations

MM-IFEngine: Towards Multimodal Instruction Following
ICCV 2025, arXiv, 22 citations

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
ICCV 2025, arXiv, 21 citations

BUOL: A Bottom-Up Framework With Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image
CVPR 2023, arXiv, 17 citations

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images
ECCV 2022, arXiv, 15 citations

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
NeurIPS 2025, arXiv, 8 citations

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
CVPR 2021, arXiv, 7 citations

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data
ICCV 2025, arXiv, 2 citations

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
CVPR 2025, arXiv, 2 citations

Conical Visual Concentration for Efficient Large Vision-Language Models
CVPR 2025, 0 citations

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting
ICCV 2025, 0 citations

Deciphering Cross-Modal Alignment in Large Vision-Language Models via Modality Integration Rate
ICCV 2025, 0 citations