DreamAlign: Dynamic Text-to-3D Optimization with Human Preference Alignment

AAAI 2025
Abstract

Recent years have witnessed the remarkable success of text-to-3D generation, particularly with the rise of mainstream conditional diffusion models (DMs). Despite this substantial progress, existing methods still face a thorny "human preference" dilemma: the 3D content generated by these models often deviates greatly from the desired effects (e.g., perspective, aesthetics, shading, appearance) because human preferences receive little attention. To mitigate the limitation of data deficiency and enable human preference learning, we first carefully curate HP3D, a text-to-3D dataset with expert preference annotations, initially captioned by the multimodal large model LLaVA and then refined by human experts. Building on this brand-new HP3D, we further propose DreamAlign, a reward-free method that does not require designing any complex reward model; instead, it only introduces a lightweight LoRA adapter and a novel direct 3D preference optimization (D-3DPO) algorithm for training. Moreover, in the text-to-3D stage we design an additional Preference Contrastive Feedback training for score distillation sampling, which enables the generated 3D objects to align with human preferences (e.g., aesthetics, material). Extensive experiments demonstrate that DreamAlign consistently achieves state-of-the-art performance on generative quality and human preference alignment across various benchmark evaluations.
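The abstract does not spell out the D-3DPO objective. As a point of reference, a reward-free direct preference optimization loss in the style of DPO is sketched below in PyTorch, under the assumption that D-3DPO follows the standard DPO formulation over human-preferred and rejected 3D samples with a frozen reference model; all function and variable names here are illustrative, not taken from the paper.

```python
# Minimal, hypothetical sketch of a DPO-style preference loss over paired
# 3D samples. Variable names (policy_logp_*, ref_logp_*, beta) are
# illustrative and not from DreamAlign itself.
import torch
import torch.nn.functional as F

def dpo_style_loss(policy_logp_preferred: torch.Tensor,
                   policy_logp_rejected: torch.Tensor,
                   ref_logp_preferred: torch.Tensor,
                   ref_logp_rejected: torch.Tensor,
                   beta: float = 0.1) -> torch.Tensor:
    """Push the (e.g., LoRA-adapted) policy to score the human-preferred
    sample higher than the rejected one, relative to a frozen reference
    model, without training an explicit reward model."""
    # Log-ratio of policy vs. reference likelihood for each side of the pair.
    pref_ratio = policy_logp_preferred - ref_logp_preferred
    rej_ratio = policy_logp_rejected - ref_logp_rejected
    # Bradley-Terry style logistic loss on the scaled difference of log-ratios.
    return -F.logsigmoid(beta * (pref_ratio - rej_ratio)).mean()
```

In this generic formulation, only the small adapter is updated, which is consistent with the abstract's claim of avoiding a separate reward model; how the log-likelihoods of 3D samples are computed is specific to the paper and not reproduced here.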

Citation History

Jan 27, 2026: 0
Feb 4, 2026: 3 (+3)