Zhiyu Mei
3 papers, 379 total citations
Papers (3)
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
ICML 2024, arXiv, 253 citations
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
NeurIPS 2025, arXiv, 117 citations
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
ICLR 2024, arXiv, 9 citations