Poster "preference learning" Papers

39 papers found

Advancing LLM Reasoning Generalists with Preference Trees

Lifan Yuan, Ganqu Cui, Hanbin Wang et al.

ICLR 2025 · arXiv:2404.02078 · 183 citations

Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences

Jing-An Sun, Hang Fan, Junchao Gong et al.

NEURIPS 2025 · arXiv:2505.22008 · 2 citations

Bandit Learning in Matching Markets with Indifference

Fang Kong, Jingqi Tang, Mingzhu Li et al.

ICLR 2025 · 1 citation

Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble

Hanyang Wang, Juergen Branke, Matthias Poloczek

NEURIPS 2025 · arXiv:2501.18792

Diverse Preference Learning for Capabilities and Alignment

Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell

ICLR 2025 · arXiv:2511.08594 · 24 citations

DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

Huaisheng Zhu, Teng Xiao, Vasant Honavar

ICLR 2025 · 22 citations

Generative Adversarial Ranking Nets

Yinghua Yao, Yuangang Pan, Jing Li et al.

ICLR 2025 · 1 citation

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NEURIPS 2025 · arXiv:2507.00018 · 15 citations

Improving Large Vision and Language Models by Learning from a Panel of Peers

Jefferson Hernandez, Jing Shi, Simon Jenni et al.

ICCV 2025 · arXiv:2509.01610 · 1 citation

Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach

Haitong Ma, Haoran Yu, Haobo Fu et al.

NEURIPS 2025

LLaVA-Critic: Learning to Evaluate Multimodal Models

Tianyi Xiong, Xiyao Wang, Dong Guo et al.

CVPR 2025 · arXiv:2410.02712 · 103 citations

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 · arXiv:2410.17637 · 22 citations

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.

ICLR 2025 · arXiv:2410.03450 · 8 citations

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 · arXiv:2507.21391 · 7 citations

NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data

Haolong Qian, Xianliang Yang, Ling Zhang et al.

NEURIPS 2025

Online-to-Offline RL for Agent Alignment

Xu Liu, Haobo Fu, Stefano V. Albrecht et al.

ICLR 2025

Preference Learning for AI Alignment: a Causal Perspective

Katarzyna Kobalczyk, Mihaela van der Schaar

ICML 2025 · arXiv:2506.05967 · 2 citations

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Chris Cundy, Adam Gleave

NEURIPS 2025 · arXiv:2505.13787 · 4 citations

Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward

Dipendra Misra, Aldo Pacchiano, Ta-Chung Chi et al.

NEURIPS 2025 · arXiv:2601.19055

ProSec: Fortifying Code LLMs with Proactive Security Alignment

Xiangzhe Xu, Zian Su, Jinyao Guo et al.

ICML 2025 · arXiv:2411.12882 · 13 citations

ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning

Timo Kaufmann, Yannick Metz, Daniel Keim et al.

NEURIPS 2025 · arXiv:2512.25023

RouteLLM: Learning to Route LLMs from Preference Data

Isaac Ong, Amjad Almahairi, Vincent Wu et al.

ICLR 2025 · 24 citations

Scalable Valuation of Human Feedback through Provably Robust Model Alignment

Masahiro Fujisawa, Masaki Adachi, Michael A Osborne

NEURIPS 2025 · arXiv:2505.17859 · 1 citation

Self-Boosting Large Language Models with Synthetic Preference Data

Qingxiu Dong, Li Dong, Xingxing Zhang et al.

ICLR 2025 · arXiv:2410.06961 · 32 citations

Self-Evolved Reward Learning for LLMs

Chenghua Huang, Zhizhen Fan, Lu Wang et al.

ICLR 2025 · arXiv:2411.00418 · 19 citations

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NEURIPS 2025 · arXiv:2506.01420 · 3 citations

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Jiale Cheng, Xiao Liu, Cunxiang Wang et al.

ICLR 2025 · arXiv:2412.11605 · 13 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NEURIPS 2025 · arXiv:2506.04721 · 4 citations

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.

ICLR 2025 · arXiv:2410.08847 · 51 citations

Variational Best-of-N Alignment

Afra Amini, Tim Vieira, Elliott Ash et al.

ICLR 2025 · arXiv:2407.06057 · 38 citations

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.

ICCV 2025 · arXiv:2503.20491 · 13 citations

Active Preference Learning for Large Language Models

William Muldrew, Peter Hayes, Mingtian Zhang et al.

ICML 2024 · arXiv:2402.08114 · 46 citations

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024 · arXiv:2404.06013 · 17 citations

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Kenneth Li, Samy Jelassi, Hugh Zhang et al.

ICML 2024 · arXiv:2402.14688 · 15 citations

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Harrison Lee, Samrat Phatale, Hassan Mansoor et al.

ICML 2024 · arXiv:2309.00267 · 527 citations

RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

Yufei Wang, Zhanyi Sun, Jesse Zhang et al.

ICML 2024 · arXiv:2402.03681 · 119 citations

Self-Rewarding Language Models

Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.

ICML 2024 · arXiv:2401.10020 · 497 citations

Transforming and Combining Rewards for Aligning Large Language Models

Zihao Wang, Chirag Nagpal, Jonathan Berant et al.

ICML 2024 · arXiv:2402.00742 · 26 citations

ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback

Ganqu Cui, Lifan Yuan, Ning Ding et al.

ICML 2024 · arXiv:2310.01377 · 214 citations