"preference modeling" Papers

12 papers found

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs

Mehran Shakerinava, Siamak Ravanbakhsh, Adam Oberman

NeurIPS 2025 (spotlight) · arXiv:2505.12049

CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search

Xiao-Wen Yang, Zhi Zhou, Haiming Wang et al.

ICLR 2025 · 4 citations

Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles

Aadirupa Saha, Robert Schapire

NeurIPS 2025

Generalized Top-k Mallows Model for Ranked Choices

Shahrzad Haddadan, Sara Ahmadian

NeurIPS 2025 (spotlight) · arXiv:2510.22040

Inverse Constitutional AI: Compressing Preferences into Principles

Arduin Findeis, Timo Kaufmann, Eyke Hüllermeier et al.

ICLR 2025 · arXiv:2406.06560 · 26 citations

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Yuheng Zhang, Dian Yu, Baolin Peng et al.

ICLR 2025 · arXiv:2407.00617 · 34 citations

Logic-Logit: A Logic-Based Approach to Choice Modeling

Shuhan Zhang, Wendi Ren, Shuang Li

ICLR 2025

Pairwise Calibrated Rewards for Pluralistic Alignment

Daniel Halpern, Evi Micha, Ariel Procaccia et al.

NeurIPS 2025 · arXiv:2506.06298 · 2 citations

Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling

Yichuan Cao, Yibo Miao, Xiao-Shan Gao et al.

NeurIPS 2025 · arXiv:2505.21074 · 2 citations

Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models

Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.

ICLR 2025 · arXiv:2410.01532 · 8 citations

Nash Learning from Human Feedback

Rémi Munos, Michal Valko, Daniele Calandriello et al.

ICML 2024 (spotlight) · arXiv:2312.00886 · 195 citations

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input

Andi Peng, Yuying Sun, Tianmin Shu et al.

ICML 2024 (oral) · arXiv:2405.14769 · 5 citations