MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

161 citations. Ranked #9 of 3340 papers in ICML 2025.

Abstract

In recent years, while generative AI has advanced significantly in image generation, video generation continues to face challenges in controllability, length, and detail quality, which hinder its application. We present MimicMotion, a framework for generating high-quality human videos of arbitrary length using motion guidance. Our approach has several highlights. Firstly, we introduce confidence-aware pose guidance that ensures high frame quality and temporal smoothness. Secondly, we introduce regional loss amplification based on pose confidence, which reduces image distortion in key regions. Lastly, we propose a progressive latent fusion strategy to generate long and smooth videos. Experiments demonstrate the effectiveness of our approach in producing high-quality human motion videos. Videos and comparisons are available at https://tencent.github.io/MimicMotion.

Citation History

Jan 28, 2026: 0
Feb 13, 2026: 161 (+161)