Poster Papers on "preference alignment"
14 papers found
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao, Fanqi Wan, Jiajian Guo et al.
ICLR 2025 · arXiv:2502.17927 · 4 citations
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
Zhaowei Zhang, Fengshuo Bai, Qizhi Chen et al.
ICLR 2025 · arXiv:2502.19148 · 20 citations
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations
Thomas Tian, Kratarth Goel
ICLR 2025 · arXiv:2503.20105 · 4 citations
EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment
Yufei Zhu, Yiming Zhong, Zemin Yang et al.
ICCV 2025 · arXiv:2503.14329 · 2 citations
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Pengfei Zhao, Rongbo Luan, Wei Zhang et al.
NeurIPS 2025 · arXiv:2506.06970 · 1 citation
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
Aaron J. Li, Satyapriya Krishna, Hima Lakkaraju
ICLR 2025 · arXiv:2404.18870 · 10 citations
Multi-domain Distribution Learning for De Novo Drug Design
Arne Schneuing, Ilia Igashov, Adrian Dobbelstein et al.
ICLR 2025 · arXiv:2508.17815 · 11 citations
No Preference Left Behind: Group Distributional Preference Optimization
Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.
ICLR 2025arXiv:2412.20299
18
citations
PersonalLLM: Tailoring LLMs to Individual Preferences
Thomas Zollo, Andrew Siah, Naimeng Ye et al.
ICLR 2025 · arXiv:2409.20296 · 28 citations
Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM
Yatai Ji, Jiacheng Zhang, Jie Wu et al.
ICCV 2025 · arXiv:2412.15156 · 10 citations
Uncertainty-aware Preference Alignment for Diffusion Policies
Runqing Miao, Sheng Xu, Runyi Zhao et al.
NeurIPS 2025
VideoDPO: Omni-Preference Alignment for Video Diffusion Generation
Runtao Liu, Haoyu Wu, Zheng Ziqiang et al.
CVPR 2025 · arXiv:2412.14167 · 75 citations
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Songyang Gao, Qiming Ge, Wei Shen et al.
ICML 2024 · arXiv:2401.11458 · 21 citations
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Hao Zhao, Maksym Andriushchenko, Francesco Croce et al.
ICML 2024 · arXiv:2402.04833 · 88 citations