Poster "reinforce algorithm" Papers
4 papers found
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards
Charles Arnal, Gaëtan Narozniak, Vivien Cabannes et al.
NeurIPS 2025, arXiv:2506.20520
17 citations
HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang, Alexander Bukharin, Olivier Delalleau et al.
ICLR 2025, arXiv:2410.01257
112 citations
Finding Visual Task Vectors
Alberto Hojel, Yutong Bai, Trevor Darrell et al.
ECCV 2024, arXiv:2404.05729
14 citations
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang et al.
ICML 2024, arXiv:2310.10505
147 citations