TeRA: Rethinking Text-guided Realistic 3D Avatar Generation

1 citation

arXiv:2509.02466

#1381 of 2701 papers in ICCV 2025

Top Authors

Yanwen Wang, Yiyu Zhuang, Jiawei Zhang, Li Wang, Yifei Zeng, Xun Cao, Xinxin Zuo, Hao Zhu

Abstract

In this paper, we rethink text-to-avatar generative models by proposing TeRA, a framework that is more efficient and effective than previous SDS-based models and general large 3D generative models. Our approach employs a two-stage training strategy for learning a native 3D avatar generative model. First, we distill a decoder from a large human reconstruction model to derive a structured latent space. Second, we train a text-controlled latent diffusion model to generate photorealistic 3D human avatars within this latent space. TeRA improves performance by eliminating slow iterative optimization and enables text-based partial customization through a structured 3D human representation. Experiments demonstrate our approach's superiority over previous text-to-avatar generative models in both subjective and objective evaluations.

Citation History

[Chart: citation count over time, Jan 24, 2026 to Feb 13, 2026]