"human value alignment" Papers
3 papers found
Anyprefer: An Agentic Framework for Preference Data Synthesis
Yiyang Zhou, Zhaoyang Wang, Tianle Wang et al.
ICLR 2025, arXiv:2504.19276
11 citations
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
Yougang Lyu, Lingyong Yan, Zihan Wang et al.
ICLR 2025 (oral), arXiv:2410.07672
16 citations
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation
Xianghe Pang, Shuo Tang, Rui Ye et al.
ICML 2024 (spotlight), arXiv:2402.05699
48 citations