Spotlight "scaling laws" Papers
8 papers found
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary Charles, Gabriel Teston, Lucio Dery et al.
NEURIPS 2025, spotlight, arXiv:2503.09799, 14 citations
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu, Kaifeng Lyu, Jiazheng Li et al.
NEURIPS 2025, spotlight, arXiv:2505.18091, 2 citations
Language Modeling by Language Models
Junyan Cheng, Peter Clark, Kyle Richardson
NEURIPS 2025, spotlight, arXiv:2506.20249, 3 citations
Predictable Scale (Part II) --- Farseer: A Refined Scaling Law in LLMs
Houyi Li, Wenzhen Zheng, Qiufeng Wang et al.
NEURIPS 2025, spotlight
Scaling Laws For Scalable Oversight
Joshua Engels, David Baek, Subhash Kantamneni et al.
NEURIPS 2025, spotlight, arXiv:2504.18530, 4 citations
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
NEURIPS 2025, spotlight, arXiv:2504.09597, 6 citations
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando Ceron, Ghada Sokar, Timon Willi et al.
ICML 2024, spotlight, arXiv:2402.08609, 64 citations
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag et al.
ICML 2024, spotlight, arXiv:2311.03233, 2 citations