Poster "reward optimization" Papers
4 papers found
Alignment of Large Language Models with Constrained Learning
Botong Zhang, Shuo Li, Ignacio Hounie et al.
NEURIPS 2025, arXiv:2505.19387
2 citations
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
Chenyu Wang, Masatoshi Uehara, Yichun He et al.
ICLR 2025, arXiv:2410.13643
45 citations
Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference
Stephen Zhao, Aidan Li, Rob Brekelmans et al.
NEURIPS 2025, arXiv:2510.21184
GFlowNet Training by Policy Gradients
Puhua Niu, Shili Wu, Mingzhou Fan et al.
ICML 2024, arXiv:2408.05885
3 citations