Papers by Rongzhi Zhang
2 papers found
Conference
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.
NeurIPS 2025, arXiv:2505.07782
7 citations
Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only
Qingru Zhang, Liang Qiu, Ilgee Hong et al.
COLM 2025, arXiv:2510.21090
1 citation