"policy regularization" Papers
3 papers found
Conference
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Cassidy Laidlaw, Shivam Singhal, Anca Dragan
ICLR 2025, arXiv:2403.03185
25 citations
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
Nguyen Phuc, Ngoc-Hieu Nguyen, Duy M. H. Nguyen et al.
NeurIPS 2025, arXiv:2506.08681
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu, Yang Li, Yixing Lan et al.
ICML 2024, arXiv:2405.19909
13 citations