Paper "human preference alignment" Papers
4 papers found
Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution
Wentao Tan, Qiong Cao, Yibing Zhan et al.
AAAI 2025 · arXiv:2412.15650 · 7 citations
Evaluating the Evaluator: Measuring LLMs’ Adherence to Task Evaluation Instructions
Bhuvanashree Murugadoss, Christian Poelitz, Ian Drosos et al.
AAAI 2025 · arXiv:2408.08781 · 39 citations
Radiology Report Generation via Multi-objective Preference Optimization
Ting Xiao, Lei Shi, Peng Liu et al.
AAAI 2025 · arXiv:2412.08901 · 10 citations
Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations
Zilin Wang, Haolin Zhuang, Lu Li et al.
AAAI 2024 · arXiv:2312.11442 · 5 citations