"model scaling" Papers

13 papers found

Calibrating Large Language Models with Sample Consistency

Qing Lyu, Kumar Shridhar, Chaitanya Malaviya et al.

AAAI 2025paperarXiv:2402.13904
52
citations

Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving

Yangzhen Wu, Zhiqing Sun, Shanda Li et al.

ICLR 2025
146
citations

Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning

Xiaolei Wang, Xinyu Tang, Junyi Li et al.

ICLR 2025arXiv:2406.14022
6
citations

Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs

Rui Dai, Sile Hu, Xu Shen et al.

ICLR 2025arXiv:2504.10902
9
citations

(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

Margaret Li, Sneha Kudugunta, Luke Zettlemoyer

ICLR 2025
9
citations

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.

ICLR 2025arXiv:2503.09543
16
citations

RAST: Reasoning Activation in LLMs via Small-model Transfer

Siru Ouyang, Xinyu Zhu, Zilin Xiao et al.

NEURIPS 2025arXiv:2506.15710
2
citations

Scaling and context steer LLMs along the same computational path as the human brain

Joséphine Raugel, Jérémy Rapin, Stéphane d'Ascoli et al.

NEURIPS 2025oralarXiv:2512.01591

Should VLMs be Pre-trained with Image Data?

Sedrick Keh, Jean Mercat, Samir Yitzhak Gadre et al.

ICLR 2025arXiv:2503.07603

TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining

Wanchao Liang, Tianyu Liu, Less Wright et al.

ICLR 2025
53
citations

Differentiable Model Scaling using Differentiable Topk

Kai Liu, Ruohui Wang, Jianfei Gao et al.

ICML 2024arXiv:2405.07194
4
citations

Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models

Francesca-Zhoufan Li, Ava Amini, Yisong Yue et al.

ICML 2024

LoRA+: Efficient Low Rank Adaptation of Large Models

Soufiane Hayou, Nikhil Ghosh, Bin Yu

ICML 2024arXiv:2402.12354
341
citations