"reward transformation" Papers
4 papers found
Conference
Efficient Last-Iterate Convergence in Solving Extensive-Form Games
Linjian Meng, Tianpei Yang, Youzhi Zhang et al.
NEURIPS 2025
Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems
Christian Walder, Deep Tejas Karkhanis
NEURIPS 2025 (spotlight), arXiv:2505.15201
28 citations
Uncertainty-Sensitive Privileged Learning
Fan-Ming Luo, Lei Yuan, Yang Yu
NEURIPS 2025 (oral)
Transforming and Combining Rewards for Aligning Large Language Models
Zihao Wang, Chirag Nagpal, Jonathan Berant et al.
ICML 2024, arXiv:2402.00742
26 citations