Papers by Shuozhe Li
2 papers found
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Haoran Xu, Shuozhe Li, Harshit Sikchi et al.
ICLR 2025, arXiv:2504.13368
3 citations
ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
Ruiyang Zhou, Shuozhe Li, Amy Zhang et al.
NeurIPS 2025, arXiv:2507.02834
5 citations