Poster "online policy optimization" Papers
2 papers found
Maximizing the Value of Predictions in Control: Accuracy Is Not Enough
Yiheng Lin, Christopher Yeh, Zaiwei Chen et al.
NeurIPS 2025, arXiv:2506.04497
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello, Zhaohan Guo, Rémi Munos et al.
ICML 2024, arXiv:2403.08635
88 citations