"reward model training" Papers
10 papers found
An Evaluation Framework for Product Images Background Inpainting Based on Human Feedback and Product Consistency
Yuqi Liang, Jun Luo, Xiaoxi Guo et al.
AAAI 2025, arXiv:2412.17504
1 citation
HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang, Alexander Bukharin, Olivier Delalleau et al.
ICLR 2025, arXiv:2410.01257
112 citations
RRM: Robust Reward Model Training Mitigates Reward Hacking
Tianqi Liu, Wei Xiong, Jie Ren et al.
ICLR 2025, arXiv:2409.13156
50 citations
Self-Evolved Reward Learning for LLMs
Chenghua Huang, Zhizhen Fan, Lu Wang et al.
ICLR 2025, arXiv:2411.00418
19 citations
Weak to Strong Generalization for Large Language Models with Multi-capabilities
Yucheng Zhou, Jianbing Shen, Yu Cheng
ICLR 2025
70 citations
Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations
Zilin Wang, Haolin Zhuang, Lu Li et al.
AAAI 2024, arXiv:2312.11442
5 citations
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.
AAAI 2024, arXiv:2305.15685
78 citations
Rich Human Feedback for Text-to-Image Generation
Youwei Liang, Junfeng He, Gang Li et al.
CVPR 2024, arXiv:2312.10240
134 citations
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Harrison Lee, Samrat Phatale, Hassan Mansoor et al.
ICML 2024, arXiv:2309.00267
527 citations
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.
ICML 2024, arXiv:2401.10020
497 citations