Shane Bergsma
2 papers, 41 total citations
Papers (2)
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
ICLR 2025, arXiv, 24 citations
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
NeurIPS 2025, arXiv, 17 citations