Papers by Changyu Chen
3 papers found
Conference
Bootstrapping Language Models with DPO Implicit Rewards
Changyu Chen, Zichen Liu, Chao Du et al.
ICLR 2025, arXiv:2406.09760
51 citations
Efficient Process Reward Model Training via Active Learning
Keyu Duan, Zichen Liu, Xin Mao et al.
COLM 2025, arXiv:2504.10559
9 citations
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu, Changyu Chen, Wenjun Li et al.
COLM 2025, arXiv:2503.20783
714 citations