Large Language Bayes

2 citations · ranked #1951 of 5858 papers in NeurIPS 2025 · 1 top author · 7 data points

Abstract

Many domain experts do not have the time or expertise to write formal Bayesian models. This paper takes an informal problem description as input, and combines a large language model and a probabilistic programming language to define a joint distribution over formal models, latent variables, and data. A posterior over latent variables follows by conditioning on observed data and integrating over formal models. This presents a challenging inference problem. We suggest an inference recipe that amounts to generating many formal models from the large language model, performing approximate inference on each, and then doing a weighted average. This is justified and analyzed as a combination of self-normalized importance sampling, MCMC, and importance-weighted variational inference. Experimentally, this produces sensible predictions from only data and an informal problem description, without the need to specify a formal model.
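The recipe described in the abstract amounts to Bayesian model averaging with self-normalized importance weights over sampled formal models. Below is a minimal, self-contained sketch of that weighting step, using three invented conjugate Gaussian models in place of LLM-generated probabilistic programs so that each per-model posterior and marginal likelihood is exact; the data, priors, and model set are illustrative assumptions, not from the paper, where per-model inference is approximate (MCMC or variational).

```python
import numpy as np
from scipy.stats import multivariate_normal

# Toy stand-in for the abstract's recipe: the "formal models" are three
# hand-written Gaussian priors on an unknown mean (illustrative only).
y = np.array([1.8, 2.1, 2.4, 1.9])              # invented observed data
sigma = 0.5                                      # known observation noise
models = [(0.0, 1.0), (2.0, 0.5), (5.0, 2.0)]   # candidate (prior mean, prior sd)

n = len(y)
post_means, log_evidence = [], []
for m0, s0 in models:
    # Exact Normal-Normal posterior over the latent mean under this model.
    var = 1.0 / (1.0 / s0**2 + n / sigma**2)
    post_means.append(var * (m0 / s0**2 + y.sum() / sigma**2))
    # Log marginal likelihood: y ~ N(m0 * 1, sigma^2 I + s0^2 11^T).
    cov = sigma**2 * np.eye(n) + s0**2 * np.ones((n, n))
    log_evidence.append(
        multivariate_normal.logpdf(y, mean=m0 * np.ones(n), cov=cov))

# Self-normalized weights: each model counts in proportion to its evidence,
# and the posterior over latent variables is the weighted mixture.
lw = np.array(log_evidence)
w = np.exp(lw - lw.max())
w /= w.sum()
print("model weights:", w.round(3))
print("model-averaged posterior mean:", float(w @ post_means))
```

With LLM-generated models the evidence term is itself estimated (hence the paper's importance-weighted variational analysis), but the normalization and averaging step is the same.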

Citation History

Date           Citations
Jan 26, 2026   0
Jan 26, 2026   2 (+2)
Jan 27, 2026   2
Feb 3, 2026    2
Feb 13, 2026   2