"preference-based learning" Papers
6 papers found
Approximated Variational Bayesian Inverse Reinforcement Learning for Large Language Model Alignment
Yuang Cai, Yuyu Yuan, Jinsheng Shi et al.
AAAI 2025, arXiv:2411.09341
4 citations
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
ICLR 2025, arXiv:2503.00799
5 citations
Pareto Prompt Optimization
Guang Zhao, Byung-Jun Yoon, Gilchan Park et al.
ICLR 2025
1 citation
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Shusheng Xu, Wei Fu, Jiaxuan Gao et al.
ICML 2024, arXiv:2404.10719
253 citations
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
Utsav Singh, Wesley A. Suttle, Brian Sadler et al.
ICML 2024, arXiv:2404.13423
5 citations
Rating-Based Reinforcement Learning
Devin White, Mingkang Wu, Ellen Novoseller et al.
AAAI 2024, arXiv:2307.16348
14 citations