"human alignment" Papers
3 papers found
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Javier Rando, Tony Wang, Stewart Slocum et al.
ICLR 2025 · arXiv:2307.15217
750 citations
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello, Zhaohan Guo, Rémi Munos et al.
ICML 2024 · arXiv:2403.08635
88 citations
Preference Ranking Optimization for Human Alignment
Feifan Song, Bowen Yu, Minghao Li et al.
AAAI 2024 · arXiv:2306.17492
337 citations