"preference modeling" Papers
12 papers found

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
Mehran Shakerinava, Siamak Ravanbakhsh, Adam Oberman
NeurIPS 2025 (spotlight) · arXiv:2505.12049

CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search
Xiao-Wen Yang, Zhi Zhou, Haiming Wang et al.
ICLR 2025 · 4 citations

Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles
Aadirupa Saha, Robert Schapire
NeurIPS 2025

Generalized Top-k Mallows Model for Ranked Choices
Shahrzad Haddadan, Sara Ahmadian
NeurIPS 2025 (spotlight) · arXiv:2510.22040

Inverse Constitutional AI: Compressing Preferences into Principles
Arduin Findeis, Timo Kaufmann, Eyke Hüllermeier et al.
ICLR 2025 · arXiv:2406.06560 · 26 citations

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang, Dian Yu, Baolin Peng et al.
ICLR 2025 · arXiv:2407.00617 · 34 citations

Logic-Logit: A Logic-Based Approach to Choice Modeling
Shuhan Zhang, Wendi Ren, Shuang Li
ICLR 2025

Pairwise Calibrated Rewards for Pluralistic Alignment
Daniel Halpern, Evi Micha, Ariel Procaccia et al.
NeurIPS 2025 · arXiv:2506.06298 · 2 citations

Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling
Yichuan Cao, Yibo Miao, Xiao-Shan Gao et al.
NeurIPS 2025 · arXiv:2505.21074 · 2 citations

Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.
ICLR 2025 · arXiv:2410.01532 · 8 citations

Nash Learning from Human Feedback
Rémi Munos, Michal Valko, Daniele Calandriello et al.
ICML 2024 (spotlight) · arXiv:2312.00886 · 195 citations

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
Andi Peng, Yuying Sun, Tianmin Shu et al.
ICML 2024 (oral) · arXiv:2405.14769 · 5 citations