"parameter scaling" Papers
6 papers found
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi, Fan Nie, Alexandre Alahi et al.
NeurIPS 2025, oral, arXiv:2506.16029
5 citations
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Miller et al.
NeurIPS 2025, arXiv:2502.06857
10 citations
Language models scale reliably with over-training and on downstream tasks
Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar et al.
ICLR 2025, arXiv:2403.08540
79 citations
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
Xianhang Li, Yanqing Liu, Haoqin Tu et al.
ICCV 2025, arXiv:2505.04601
6 citations
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Jonas Geiping, Sean McLeish, Neel Jain et al.
NeurIPS 2025, spotlight, arXiv:2502.05171
158 citations
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando Ceron, Ghada Sokar, Timon Willi et al.
ICML 2024, spotlight, arXiv:2402.08609
64 citations