Generative Marginalization Models

2 citations · ranked #1773 of 2635 papers in ICML 2024

Abstract

We introduce marginalization models (MAMs), a new family of generative models for high-dimensional discrete data. They offer scalable and flexible generative modeling by explicitly modeling all induced marginal distributions. Marginalization models enable fast approximation of arbitrary marginal probabilities with a single forward pass of the neural network, which overcomes a major limitation of arbitrary marginal inference models such as any-order autoregressive models. MAMs also address the scalability bottleneck encountered in training any-order generative models for high-dimensional problems in the energy-based training setting, where the goal is to match the learned distribution to a given desired probability (specified by an unnormalized log-probability function such as an energy or reward function). We propose scalable methods for learning the marginals, grounded in the concept of "marginalization self-consistency". We demonstrate the effectiveness of the proposed model on a variety of discrete data distributions, including images, text, physical systems, and molecules, in both maximum likelihood and energy-based training settings. MAMs achieve orders-of-magnitude speedups in evaluating marginal probabilities in both settings. For energy-based training tasks, MAMs enable any-order generative modeling of high-dimensional problems beyond the scale of previous methods. Code is available at github.com/PrincetonLIPS/MaM.
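
To make the "marginalization self-consistency" idea concrete, here is a minimal, hypothetical PyTorch sketch of the core constraint: the model's probability for a partial assignment should equal the sum of its probabilities over all values of one additional variable. The `MarginalNet` architecture, the convention of marking unassigned positions with -1, and the squared-error penalty are illustrative assumptions for this sketch, not the released implementation at github.com/PrincetonLIPS/MaM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MarginalNet(nn.Module):
    """Toy network producing an (unconstrained) log-marginal log p_theta(x_S).

    x is a (batch, D) long tensor over K discrete values; unassigned
    positions are marked with -1. The architecture is purely illustrative.
    """
    def __init__(self, D, K, hidden=128):
        super().__init__()
        self.K = K
        # One-hot encode each position with an extra "unassigned" channel.
        self.net = nn.Sequential(
            nn.Linear(D * (K + 1), hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        onehot = F.one_hot(x + 1, self.K + 1).float()   # -1 -> channel 0
        return self.net(onehot.flatten(1)).squeeze(-1)  # (batch,) log-marginals

def self_consistency_loss(model, x_partial, pos):
    """Penalize violation of marginalization self-consistency at `pos`:
        log p(x_S) should match log sum_k p(x_S, x_pos = k).
    """
    log_p_marginal = model(x_partial)            # marginal with x_pos unassigned
    expansions = []
    for k in range(model.K):                     # enumerate all values of x_pos
        x_ext = x_partial.clone()
        x_ext[:, pos] = k
        expansions.append(model(x_ext))
    log_p_sum = torch.logsumexp(torch.stack(expansions, dim=-1), dim=-1)
    return ((log_p_marginal - log_p_sum) ** 2).mean()

# Example: D=8 binary variables (K=2); the last three positions are unassigned.
model = MarginalNet(D=8, K=2)
x = torch.randint(0, 2, (16, 8))
x[:, 5:] = -1
loss = self_consistency_loss(model, x, pos=5)
loss.backward()
```

In this sketch, a single forward pass of `MarginalNet` yields an arbitrary marginal probability, and the consistency penalty ties marginals of different orders together so the model remains coherent without evaluating full autoregressive products.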

Citation History

Jan 28, 2026: 0 citations
Feb 13, 2026: 2 citations