"reward overoptimization" Papers
2 papers found
Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
Ziyi Zhang, Sen Zhang, Yibing Zhan et al.
ICML 2024 (oral), arXiv:2402.08552
24 citations
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu, Michael Jordan, Jiantao Jiao
ICML 2024, arXiv:2401.16335
48 citations