Jianhua Han
23 papers, 1,437 total citations
Papers (23)
DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection
NeurIPS 2022
arXiv
223
citations
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
ICLR 2025
arXiv
174
citations
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving
ECCV 2022
arXiv
135
citations
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
ECCV 2024
arXiv
115
citations
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment
CVPR 2023
arXiv
104
citations
Open-World Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
ECCV 2022
arXiv
89
citations
ONCE-3DLanes: Building Monocular 3D Lane Detection
CVPR 2022
arXiv
87
citations
Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models
CVPR 2024
arXiv
86
citations
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
CVPR 2024
arXiv
48
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
arXiv
48
citations
Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving
NeurIPS 2022
arXiv
47
citations
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis
ICLR 2024
arXiv
46
citations
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
CVPR 2023
arXiv
45
citations
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance
ICCV 2025
arXiv
44
citations
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
AAAI 2024
arXiv
42
citations
Generative Negative Text Replay for Continual Vision-Language Pretraining
ECCV 2022
arXiv
25
citations
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
CVPR 2023
arXiv
20
citations
Implicit Concept Removal of Diffusion Models
ECCV 2024
arXiv
18
citations
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
CVPR 2025
arXiv
15
citations
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
ECCV 2024
arXiv
15
citations
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
ICCV 2023
arXiv
8
citations
GrowCLIP: Data-Aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-Training
ICCV 2023
arXiv
3
citations
CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data
CVPR 2023
0
citations