Papers by Fan Lai
3 papers found
Conference
Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
Haizhong Zheng, Yang Zhou, Brian Bartoldson et al.
NeurIPS 2025 (oral), arXiv:2506.02177
44 citations
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun, Penghan Wang, Fan Lai
NeurIPS 2025, arXiv:2501.14808
7 citations
Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models
Haoyi Song, Ruihan Ji, Naichen Shi et al.
NeurIPS 2025, arXiv:2506.09684
2 citations