Lihong Li
5 papers, 214 total citations
Papers (5)
Off-Policy Evaluation via the Regularized Lagrangian (NeurIPS 2020, arXiv): 121 citations
CoinDICE: Off-Policy Confidence Interval Estimation (NeurIPS 2020, arXiv): 89 citations
Ask a Strong LLM Judge when Your Reward Model is Uncertain (NeurIPS 2025, arXiv): 3 citations
Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only (COLM 2025, arXiv): 1 citation
Escaping the Gravitational Pull of Softmax (NeurIPS 2020): 0 citations