"model scaling" Papers
13 papers found
Calibrating Large Language Models with Sample Consistency
Qing Lyu, Kumar Shridhar, Chaitanya Malaviya et al.
AAAI 2025, arXiv:2402.13904
52 citations
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
Yangzhen Wu, Zhiqing Sun, Shanda Li et al.
ICLR 2025
146 citations
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
Xiaolei Wang, Xinyu Tang, Junyi Li et al.
ICLR 2025, arXiv:2406.14022
6 citations
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai, Sile Hu, Xu Shen et al.
ICLR 2025, arXiv:2504.10902
9 citations
(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning
Margaret Li, Sneha Kudugunta, Luke Zettlemoyer
ICLR 2025
9 citations
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.
ICLR 2025, arXiv:2503.09543
16 citations
RAST: Reasoning Activation in LLMs via Small-model Transfer
Siru Ouyang, Xinyu Zhu, Zilin Xiao et al.
NeurIPS 2025, arXiv:2506.15710
2 citations
Scaling and context steer LLMs along the same computational path as the human brain
Joséphine Raugel, Jérémy Rapin, Stéphane d'Ascoli et al.
NeurIPS 2025 (oral), arXiv:2512.01591
Should VLMs be Pre-trained with Image Data?
Sedrick Keh, Jean Mercat, Samir Yitzhak Gadre et al.
ICLR 2025, arXiv:2503.07603
TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining
Wanchao Liang, Tianyu Liu, Less Wright et al.
ICLR 2025
53 citations
Differentiable Model Scaling using Differentiable Topk
Kai Liu, Ruohui Wang, Jianfei Gao et al.
ICML 2024, arXiv:2405.07194
4 citations
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models
Francesca-Zhoufan Li, Ava Amini, Yisong Yue et al.
ICML 2024
LoRA+: Efficient Low Rank Adaptation of Large Models
Soufiane Hayou, Nikhil Ghosh, Bin Yu
ICML 2024, arXiv:2402.12354
341 citations