Poster "human feedback alignment" Papers

10 papers found

Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models

Yongjin Yang, Sihyeon Kim, Hojung Jung et al.

ICLR 2025 · arXiv:2410.10166 · 3 citations

Curriculum Direct Preference Optimization for Diffusion and Consistency Models

Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.

CVPR 2025 · arXiv:2405.13637 · 24 citations

Generative RLHF-V: Learning Principles from Multi-modal Human Preference

Jiayi Zhou, Jiaming Ji, Boyuan Chen et al.

NeurIPS 2025 · arXiv:2505.18531 · 7 citations

Reward Learning from Multiple Feedback Types

Yannick Metz, Andras Geiszl, Raphaël Baur et al.

ICLR 2025 · arXiv:2502.21038 · 5 citations

Scalable Valuation of Human Feedback through Provably Robust Model Alignment

Masahiro Fujisawa, Masaki Adachi, Michael A. Osborne

NeurIPS 2025 · arXiv:2505.17859 · 1 citation

Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models

Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.

ICLR 2025 · arXiv:2410.01532 · 8 citations

Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining

Aaron Li, Robin Netzorg, Zhihan Cheng et al.

ICML 2024 · arXiv:2307.03887 · 4 citations

Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF

Banghua Zhu, Michael Jordan, Jiantao Jiao

ICML 2024 · arXiv:2401.16335 · 48 citations

Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion

Sanghyun Kim, Seohyeon Jung, Balhae Kim et al.

ECCV 2024 · arXiv:2407.21032 · 10 citations

ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback

Ganqu Cui, Lifan Yuan, Ning Ding et al.

ICML 2024 · arXiv:2310.01377 · 214 citations