Dit-Yan Yeung
Affiliations (1)
Hong Kong University of Science and Technology
21 papers
1,231 total citations
papers (21)
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting (NeurIPS 2022; arXiv), 259 citations
MagicDrive: Street View Generation with Diverse 3D Geometry Control (ICLR 2024; arXiv), 218 citations
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving (ECCV 2022; arXiv), 135 citations
RoboDreamer: Learning Compositional World Models for Robot Imagination (ICML 2024; arXiv), 107 citations
Gaussian Shell Maps for Efficient 3D Human Generation (CVPR 2024; arXiv), 67 citations
MultiSiam: Self-Supervised Multi-Instance Siamese Representation Learning for Autonomous Driving (ICCV 2021; arXiv), 59 citations
Mixed Autoencoder for Self-Supervised Visual Representation Learning (CVPR 2023; arXiv), 52 citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions (CVPR 2025; arXiv), 48 citations
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis (ICLR 2024; arXiv), 46 citations
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception (CVPR 2024; arXiv), 39 citations
3D-Aware Indoor Scene Synthesis with Depth Priors (ECCV 2022; arXiv), 37 citations
Adaptive Online Replanning with Diffusion Models (NeurIPS 2023; arXiv), 33 citations
Learning 3D-Aware Image Synthesis With Unknown Pose Distribution (CVPR 2023; arXiv), 23 citations
Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator (NeurIPS 2022; arXiv), 21 citations
Implicit Concept Removal of Diffusion Models (ECCV 2024; arXiv), 18 citations
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o (AAAI 2025; arXiv), 17 citations
Learning 3D Persistent Embodied World Models (NeurIPS 2025; arXiv), 17 citations
Anyattack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models (CVPR 2025; arXiv), 15 citations
SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation (ICCV 2023; arXiv), 15 citations
Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection (ECCV 2024; arXiv), 5 citations
CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data (CVPR 2023), 0 citations