Poster "kl divergence estimation" Papers
2 papers found
GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
Shutong Ding, Ke Hu, Shan Zhong et al.
NeurIPS 2025, arXiv:2505.18763
6 citations
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Stephen Zhao, Rob Brekelmans, Alireza Makhzani et al.
ICML 2024, arXiv:2404.17546
56 citations