Shihan Dou
5 papers · 130 total citations

Papers (5)
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
ICML 2024 · arXiv · 58 citations
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
ICLR 2025 · arXiv · 47 citations
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
ICML 2024 · arXiv · 21 citations
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
NeurIPS 2025 · arXiv · 4 citations
Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning
AAAI 2025 · 0 citations