A solvable model of learning generative diffusion: theory and insights
arXiv:2501.03937 · 5 citations
#1136 of 5858 papers in NeurIPS 2025
Abstract
In this manuscript, we consider the problem of learning a flow- or diffusion-based generative model, parametrized by a two-layer auto-encoder and trained with online stochastic gradient descent, on a high-dimensional target density with an underlying low-dimensional manifold structure. We derive a tight asymptotic characterization of low-dimensional projections of the distribution of samples generated by the learned model, ascertaining in particular its dependence on the number of training samples. Building on this analysis, we discuss how mode collapse can arise and lead to model collapse when the generative model is re-trained on generated synthetic data.
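The setting described in the abstract lends itself to a compact numerical illustration. Below is a minimal sketch, not the paper's specification: a two-layer auto-encoder denoiser with a skip connection, trained by one-pass (online) SGD on a denoising objective, on high-dimensional data concentrated along a hidden low-dimensional direction. The two-mode mixture target, the single fixed noise level, the skip-connection parametrization, and all hyperparameters are illustrative assumptions.

# Minimal sketch (assumptions flagged above): two-layer auto-encoder
# denoiser trained with online SGD on data with a rank-one hidden structure.
import numpy as np

rng = np.random.default_rng(0)

d, p = 100, 1             # ambient dimension, bottleneck (latent) width
n_steps, lr = 5000, 0.05  # online samples (each seen once), learning rate

# Target: two-mode Gaussian mixture along a hidden direction mu (norm ~ sqrt(d))
mu = rng.standard_normal(d)

def sample_x():
    s = rng.choice([-1.0, 1.0])           # which mixture mode (sign of the spike)
    return s * mu + rng.standard_normal(d)

sigma = 0.5  # a single fixed noise level standing in for a diffusion time (assumption)

# Denoiser f(x) = c * x + W2 @ tanh(W1 @ x): trainable skip + bottleneck
W1 = rng.standard_normal((p, d)) / np.sqrt(d)
W2 = rng.standard_normal((d, p)) / np.sqrt(d)
c = 0.0

for _ in range(n_steps):  # online SGD: a fresh sample at every step
    x = sample_x()
    xt = np.sqrt(1.0 - sigma**2) * x + sigma * rng.standard_normal(d)  # noised input
    h = np.tanh(W1 @ xt)
    err = c * xt + W2 @ h - x             # residual of 0.5 * ||f(xt) - x||^2
    # Plain gradients of the squared loss, with a 1/d scaling on the steps
    c  -= lr * (err @ xt) / d
    W2 -= lr * np.outer(err, h) / d
    W1 -= lr * np.outer((W2.T @ err) * (1.0 - h**2), xt) / d

# Overlap of the learned first-layer weight with the hidden direction:
# a low-dimensional summary of the kind the asymptotic analysis characterizes.
overlap = (W1 @ mu).item()
cos = abs(overlap) / (np.linalg.norm(W1) * np.linalg.norm(mu))
print("cosine overlap of W1 with mu:", cos)

The printed overlap tracks how well the learned weights recover the hidden direction as the number of online samples grows; the paper's analysis characterizes precisely such low-dimensional projections of the learned model, including their dependence on sample size.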
Citation History
Jan 26, 2026: 5
Feb 2, 2026: 5
Feb 13, 2026: 5