Poster "reward maximization" Papers
3 papers found
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.
ICML 2024, arXiv:2402.02439
36 citations
Feedback Efficient Online Fine-Tuning of Diffusion Models
Masatoshi Uehara, Yulai Zhao, Kevin Black et al.
ICML 2024, arXiv:2402.16359
44 citations
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Kenneth Li, Samy Jelassi, Hugh Zhang et al.
ICML 2024, arXiv:2402.14688
15 citations