"offline alignment" Papers
2 papers found
Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
Simon Matrenok, Skander Moalla, Caglar Gulcehre
NeurIPS 2025, arXiv:2507.08068
1 citation
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Yunhao Tang, Zhaohan Guo, Zeyu Zheng et al.
ICML 2024, arXiv:2402.05749
150 citations