Ming Yang
28
papers
984
total citations
papers (28)
Track To Detect and Segment: An Online Multi-Object Tracker
CVPR 2021
arXiv
383
citations
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
CVPR 2024
arXiv
244
citations
Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds
CVPR 2021
arXiv
111
citations
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
ICLR 2025
arXiv
64
citations
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
CVPR 2024
arXiv
52
citations
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
ECCV 2024
arXiv
23
citations
Mimir: Improving Video Diffusion Models for Precise Text Understanding
CVPR 2025
arXiv
16
citations
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
CVPR 2025
arXiv
14
citations
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
CVPR 2025
arXiv
14
citations
Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis
CVPR 2024
arXiv
12
citations
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
ICML 2024
arXiv
12
citations
From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots
NeurIPS 2025
arXiv
10
citations
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
ICML 2024
arXiv
7
citations
DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding
CVPR 2025
arXiv
6
citations
EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching
ECCV 2024
4
citations
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
ICML 2024
arXiv
3
citations
Social Debiasing for Fair Multi-modal LLMs
ICCV 2025
arXiv
3
citations
Towards Better Vision-Inspired Vision-Language Models
CVPR 2024
2
citations
HomoMatcher: Achieving Dense Feature Matching with Semi-Dense Efficiency by Homography Estimation
AAAI 2025
2
citations
Reversing Flow for Image Restoration
CVPR 2025
arXiv
2
citations
Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians
CVPR 2020
0
citations
CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance
ICCV 2025
arXiv
0
citations
Stacked Homography Transformations for Multi-View Pedestrian Detection
ICCV 2021
0
citations
Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric
NeurIPS 2023
0
citations
Engage for All: Making Ordinary Image Descriptions Appealing Again!
ICCV 2025
0
citations
SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling
CVPR 2025
0
citations
Unified Video Generation via Next-Set Prediction in Continuous Domain
ICCV 2025
0
citations
Orthogonal Non-negative Tensor Factorization based Multi-view Clustering
NeurIPS 2023
0
citations