Papers by Wenqi Chen
2 papers found
Conference
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
Jiayi Zhou, Jiaming Ji, Boyuan Chen et al.
NeurIPS 2025, arXiv:2505.18531, 7 citations
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
Boyuan Chen, Donghai Hong, Jiaming Ji et al.
NeurIPS 2025 (spotlight), arXiv:2505.23950, 1 citation