Poster "preference alignment" Papers

14 papers found

Advantage-Guided Distillation for Preference Alignment in Small Language Models

Shiping Gao, Fanqi Wan, Jiajian Guo et al.

ICLR 2025 · arXiv:2502.17927
4 citations

Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Zhaowei Zhang, Fengshuo Bai, Qizhi Chen et al.

ICLR 2025 · arXiv:2502.19148
20 citations

Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations

Thomas Tian, Kratarth Goel

ICLR 2025 · arXiv:2503.20105
4 citations

EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment

Yufei Zhu, Yiming Zhong, Zemin Yang et al.

ICCV 2025 · arXiv:2503.14329
2 citations

Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment

Pengfei Zhao, Rongbo Luan, Wei Zhang et al.

NeurIPS 2025 · arXiv:2506.06970
1 citation

More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness

Aaron J. Li, Satyapriya Krishna, Hima Lakkaraju

ICLR 2025 · arXiv:2404.18870
10 citations

Multi-domain Distribution Learning for De Novo Drug Design

Arne Schneuing, Ilia Igashov, Adrian Dobbelstein et al.

ICLR 2025 · arXiv:2508.17815
11 citations

No Preference Left Behind: Group Distributional Preference Optimization

Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.

ICLR 2025 · arXiv:2412.20299
18 citations

PersonalLLM: Tailoring LLMs to Individual Preferences

Thomas Zollo, Andrew Siah, Naimeng Ye et al.

ICLR 2025 · arXiv:2409.20296
28 citations

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Yatai Ji, Jiacheng Zhang, Jie Wu et al.

ICCV 2025 · arXiv:2412.15156
10 citations

Uncertainty-aware Preference Alignment for Diffusion Policies

Runqing Miao, Sheng Xu, Runyi Zhao et al.

NeurIPS 2025

VideoDPO: Omni-Preference Alignment for Video Diffusion Generation

Runtao Liu, Haoyu Wu, Zheng Ziqiang et al.

CVPR 2025 · arXiv:2412.14167
75 citations

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

Songyang Gao, Qiming Ge, Wei Shen et al.

ICML 2024 · arXiv:2401.11458
21 citations

Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning

Hao Zhao, Maksym Andriushchenko, Francesco Croce et al.

ICML 2024 · arXiv:2402.04833
88 citations