Poster "direct policy optimization" Papers
2 papers found
Conference
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother et al.
ICLR 2025, arXiv:2411.07007
6 citations
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello, Zhaohan Guo, Rémi Munos et al.
ICML 2024, arXiv:2403.08635
88 citations