"preference learning" Papers

52 papers found • Page 1 of 2

Advancing LLM Reasoning Generalists with Preference Trees

Lifan Yuan, Ganqu Cui, Hanbin Wang et al.

ICLR 2025 • arXiv:2404.02078
183 citations

Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences

Jing-An Sun, Hang Fan, Junchao Gong et al.

NEURIPS 2025 • arXiv:2505.22008
2 citations

Aligning Text-to-Image Diffusion Models to Human Preference by Classification

Longquan Dai, Xiaolu Wei, wang he et al.

NEURIPS 2025 • spotlight

Bandit Learning in Matching Markets with Indifference

Fang Kong, Jingqi Tang, Mingzhu Li et al.

ICLR 2025
1 citation

Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble

Hanyang Wang, Juergen Branke, Matthias Poloczek

NEURIPS 2025 • arXiv:2501.18792

DeepHalo: A Neural Choice Model with Controllable Context Effects

Shuhan Zhang, Zhi Wang, Rui Gao et al.

NEURIPS 2025 • oral • arXiv:2601.04616

Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences

Joshua Ashkinaze, Hua Shen, Saipranav Avula et al.

NEURIPS 2025 • oral • arXiv:2511.02109
1 citation

Diverse Preference Learning for Capabilities and Alignment

Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell

ICLR 2025 • arXiv:2511.08594
24 citations

DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

Huaisheng Zhu, Teng Xiao, Vasant Honavar

ICLR 2025
22 citations

Generative Adversarial Ranking Nets

Yinghua Yao, Yuangang Pan, Jing Li et al.

ICLR 2025
1 citation

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NEURIPS 2025 • arXiv:2507.00018
15 citations

Improving Large Vision and Language Models by Learning from a Panel of Peers

Jefferson Hernandez, Jing Shi, Simon Jenni et al.

ICCV 2025 • arXiv:2509.01610
1 citation

Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach

Haitong Ma, Haoran Yu, Haobo Fu et al.

NEURIPS 2025

LLaVA-Critic: Learning to Evaluate Multimodal Models

Tianyi Xiong, Xiyao Wang, Dong Guo et al.

CVPR 2025 • arXiv:2410.02712
103 citations

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

Avinandan Bose, Zhihan Xiong, Yuejie Chi et al.

COLM 2025 • arXiv:2504.14439
10 citations

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 • arXiv:2410.17637
22 citations

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.

ICLR 2025 • arXiv:2410.03450
8 citations

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 • arXiv:2507.21391
7 citations

NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data

Haolong Qian, Xianliang Yang, Ling Zhang et al.

NEURIPS 2025

Online-to-Offline RL for Agent Alignment

Xu Liu, Haobo Fu, Stefano V. Albrecht et al.

ICLR 2025

Predictive Preference Learning from Human Interventions

Haoyuan Cai, Zhenghao (Mark) Peng, Bolei Zhou

NEURIPS 2025 • spotlight • arXiv:2510.01545

Preference Learning for AI Alignment: a Causal Perspective

Katarzyna Kobalczyk, Mihaela van der Schaar

ICML 2025 • arXiv:2506.05967
2 citations

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Chris Cundy, Adam Gleave

NEURIPS 2025 • arXiv:2505.13787
4 citations

Preference Learning with Response Time: Robust Losses and Guarantees

Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis

NEURIPS 2025 • oral • arXiv:2505.22820
1 citation

Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward

Dipendra Misra, Aldo Pacchiano, Ta-Chung Chi et al.

NEURIPS 2025 • arXiv:2601.19055

ProSec: Fortifying Code LLMs with Proactive Security Alignment

Xiangzhe Xu, Zian Su, Jinyao Guo et al.

ICML 2025 • arXiv:2411.12882
13 citations

ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning

Timo Kaufmann, Yannick Metz, Daniel Keim et al.

NEURIPS 2025 • arXiv:2512.25023

RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

Tianyu Yu, Haoye Zhang, Qiming Li et al.

CVPR 2025 • highlight • arXiv:2405.17220
60 citations

RouteLLM: Learning to Route LLMs from Preference Data

Isaac Ong, Amjad Almahairi, Vincent Wu et al.

ICLR 2025
24 citations

Scalable Valuation of Human Feedback through Provably Robust Model Alignment

Masahiro Fujisawa, Masaki Adachi, Michael A Osborne

NEURIPS 2025 • arXiv:2505.17859
1 citation

Self-Boosting Large Language Models with Synthetic Preference Data

Qingxiu Dong, Li Dong, Xingxing Zhang et al.

ICLR 2025 • arXiv:2410.06961
32 citations

Self-Evolved Reward Learning for LLMs

Chenghua Huang, Zhizhen Fan, Lu Wang et al.

ICLR 2025 • arXiv:2411.00418
19 citations

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NEURIPS 2025 • arXiv:2506.01420
3 citations

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Jiale Cheng, Xiao Liu, Cunxiang Wang et al.

ICLR 2025 • arXiv:2412.11605
13 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NEURIPS 2025 • arXiv:2506.04721
4 citations

Tulu 3: Pushing Frontiers in Open Language Model Post-Training

Nathan Lambert, Jacob Morrison, Valentina Pyatkin et al.

COLM 2025 • arXiv:2411.15124
494 citations

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.

ICLR 2025 • arXiv:2410.08847
51 citations

Variational Best-of-N Alignment

Afra Amini, Tim Vieira, Elliott Ash et al.

ICLR 2025 • arXiv:2407.06057
38 citations

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.

ICCV 2025 • arXiv:2503.20491
13 citations

Active Preference Learning for Large Language Models

William Muldrew, Peter Hayes, Mingtian Zhang et al.

ICML 2024 • arXiv:2402.08114
46 citations

Customizing Language Model Responses with Contrastive In-Context Learning

Xiang Gao, Kamalika Das

AAAI 2024 • arXiv:2401.17390
20 citations

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024 • arXiv:2404.06013
17 citations

Improved Bandits in Many-to-One Matching Markets with Incentive Compatibility

Fang Kong, Shuai Li

AAAI 2024 • arXiv:2401.01528
10 citations

Interactive Hyperparameter Optimization in Multi-Objective Problems via Preference Learning

Joseph Giovanelli, Alexander Tornede, Tanja Tornede et al.

AAAI 2024 • arXiv:2309.03581
8 citations

Model Alignment as Prospect Theoretic Optimization

Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff et al.

ICML 2024 • spotlight • arXiv:2402.01306
871 citations

Multi-Objective Bayesian Optimization with Active Preference Learning

Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.

AAAI 2024 • arXiv:2311.13460
14 citations

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Kenneth Li, Samy Jelassi, Hugh Zhang et al.

ICML 2024 • arXiv:2402.14688
15 citations

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Harrison Lee, Samrat Phatale, Hassan Mansoor et al.

ICML 2024 • arXiv:2309.00267
527 citations

RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

Yufei Wang, Zhanyi Sun, Jesse Zhang et al.

ICML 2024 • arXiv:2402.03681
119 citations

Self-Rewarding Language Models

Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.

ICML 2024 • arXiv:2401.10020
497 citations