Poster "iterative preference learning" Papers
2 papers found
Conference
Progress or Regress? Self-Improvement Reversal in Post-training
Ting Wu, Xuefeng Li, Pengfei Liu
ICLR 2025, arXiv:2407.05013
19 citations
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Wei Xiong, Hanze Dong, Chenlu Ye et al.
ICML 2024, arXiv:2312.11456
312 citations