"reward model optimization" Papers
2 papers found
Inference-Time Reward Hacking in Large Language Models
Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling et al.
NeurIPS 2025 (spotlight), arXiv:2506.19248
3 citations
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Zhenfang Chen, Delin Chen, Rui Sun et al.
ICLR 2025, arXiv:2502.12130
15 citations