"human preference modeling" Papers
8 papers found
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao, Fanqi Wan, Jiajian Guo et al.
ICLR 2025 · arXiv:2502.17927 · 4 citations
Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
Felipe Maia Polo, Xinhe Wang, Mikhail Yurochkin et al.
NeurIPS 2025 · arXiv:2508.12792 · 1 citation
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations
Thomas Tian, Kratarth Goel
ICLR 2025 · arXiv:2503.20105 · 4 citations
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment
Ying Ba, Tianyu Zhang, Yalong Bai et al.
ICCV 2025 · arXiv:2507.19002 · 6 citations
TODO: Enhancing LLM Alignment with Ternary Preferences
Yuxiang Guo, Lu Yin, Bo Jiang et al.
ICLR 2025 · arXiv:2411.02442 · 5 citations
Learning Optimal Advantage from Preferences and Mistaking It for Reward
W Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson et al.
AAAI 2024 · arXiv:2310.02456 · 16 citations
Multimodal Label Relevance Ranking via Reinforcement Learning
Taian Guo, Taolin Zhang, Haoqian Wu et al.
ECCV 2024 · arXiv:2407.13221 · 1 citation
Stable Preference: Redefining training paradigm of human preference model for Text-to-Image Synthesis
Hanting Li, Hongjing Niu, Feng Zhao
ECCV 2024 · 2 citations