"policy improvement" Papers
6 papers found
Active Reinforcement Learning Strategies for Offline Policy Improvement
Ambedkar Dukkipati, Ranga Shaarad Ayyagari, Bodhisattwa Dasgupta et al.
AAAI 2025, arXiv:2412.13106
3 citations
Efficient Language-instructed Skill Acquisition via Reward-Policy Co-Evolution
Changxin Huang, Yanbin Chang, Junfan Lin et al.
AAAI 2025, arXiv:2412.13492
1 citation
When Can Model-Free Reinforcement Learning be Enough for Thinking?
Josiah Hanna, Nicholas Corrado
NeurIPS 2025, arXiv:2506.17124
1 citation
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Qingyuan Wu, Simon Zhan, Yixuan Wang et al.
ICML 2024, arXiv:2402.03141
4 citations
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen, Chen Zhu, Jiuhai Chen et al.
ICML 2024, arXiv:2402.07319
110 citations
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills
Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi et al.
ICML 2024, arXiv:2402.03244
12 citations