SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

14 citations

arXiv:2412.01801

Ranked #596 of 2873 papers in CVPR 2025

Top Authors

Aleksei Bokhovkin, Quan Meng, Shubham Tulsiani, Angela Dai

Abstract

We present SceneFactor, a diffusion-based approach to large-scale 3D scene generation that supports controllable generation and effortless editing. SceneFactor enables text-guided 3D scene synthesis through our factored diffusion formulation, leveraging latent semantic and geometric manifolds to generate 3D scenes of arbitrary size. While text input enables easy, controllable generation, text guidance remains too imprecise for intuitive, localized editing and manipulation of the generated 3D scenes. Our factored semantic diffusion therefore generates a proxy semantic space composed of semantic 3D boxes. Users can edit a generated scene by adding, removing, or resizing these proxy boxes, which in turn guide high-fidelity, consistent 3D geometric editing. Extensive experiments demonstrate that our factored diffusion approach enables high-fidelity 3D scene synthesis with effective controllable editing.
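The box-level editing described in the abstract can be illustrated with a small sketch. This is not the SceneFactor implementation: the `SemanticBox` type, the helper functions, and the dense label grid are all hypothetical names chosen for illustration, and the sketch covers only the proxy edits (add, remove, resize), not the diffusion models that would turn the edited proxy into geometry.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SemanticBox:
    """Hypothetical proxy box: a semantic label plus an axis-aligned voxel extent."""
    label: str   # semantic class, e.g. "bed"
    lo: tuple    # inclusive min corner (x, y, z), in voxels
    hi: tuple    # exclusive max corner (x, y, z), in voxels


def add_box(scene, box):
    """Return a new scene with `box` appended."""
    return scene + [box]


def remove_box(scene, index):
    """Return a new scene without the box at `index`."""
    return scene[:index] + scene[index + 1:]


def resize_box(scene, index, new_hi):
    """Return a new scene where the box at `index` has a new max corner."""
    edited = replace(scene[index], hi=new_hi)
    return scene[:index] + [edited] + scene[index + 1:]


def rasterize(scene, shape):
    """Paint boxes into a dense label grid; later boxes overwrite earlier ones."""
    size_x, size_y, size_z = shape
    grid = [[[None] * size_z for _ in range(size_y)] for _ in range(size_x)]
    for box in scene:
        # Clip each box to the grid bounds before painting.
        for x in range(max(box.lo[0], 0), min(box.hi[0], size_x)):
            for y in range(max(box.lo[1], 0), min(box.hi[1], size_y)):
                for z in range(max(box.lo[2], 0), min(box.hi[2], size_z)):
                    grid[x][y][z] = box.label
    return grid


def count_voxels(grid, label):
    """Count the voxels in `grid` that carry `label`."""
    return sum(cell == label for plane in grid for row in plane for cell in row)


# Example edit sequence on a tiny 4 x 4 x 2 proxy grid.
scene = [SemanticBox("bed", (0, 0, 0), (2, 2, 1))]
scene = add_box(scene, SemanticBox("chair", (3, 3, 0), (4, 4, 1)))
scene = resize_box(scene, 0, (3, 2, 1))
grid = rasterize(scene, (4, 4, 2))
print(count_voxels(grid, "bed"), count_voxels(grid, "chair"))
```

In a pipeline like the one the abstract describes, an edited semantic proxy of this kind would serve as the conditioning input for geometry generation. How that conditioning is actually encoded is not specified in the abstract.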

Citation History

[Chart: citation count, Jan 24, 2026 to Feb 13, 2026]