"preference feedback" Papers
4 papers found
Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment
Aadirupa Saha, Pierre Gaillard
ICLR 2025
1 citation
Non-Stationary Dueling Bandits Under a Weighted Borda Criterion
Joe Suk, Arpit Agarwal
ICLR 2025, arXiv:2403.12950
2 citations
Reward Learning from Multiple Feedback Types
Yannick Metz, Andras Geiszl, Raphaël Baur et al.
ICLR 2025, arXiv:2502.21038
5 citations
Coactive Learning for Large Language Models using Implicit User Feedback
Aaron D. Tucker, Kianté Brantley, Adam Cahall et al.
ICML 2024