"parameter initialization" Papers
6 papers found
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu, Zhiyu Ni, Yixin Wang et al.
ICLR 2025, arXiv:2504.13292, 6 citations
Sign-In to the Lottery: Reparameterizing Sparse Training
Advait Gadhikar, Tom Jacobs, Chao Zhou et al.
NeurIPS 2025, arXiv:2504.12801, 1 citation
Big Learning Expectation Maximization
Yulai Cong, Sijia Li
AAAI 2024, arXiv:2312.11926, 4 citations
Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization
Hongjing Niu, Hanting Li, Bin Li et al.
ECCV 2024
Rethinking Optimization and Architecture for Tiny Language Models
Yehui Tang, Kai Han, Fangcheng Liu et al.
ICML 2024
Stability-Informed Initialization of Neural Ordinary Differential Equations
Theodor Westny, Arman Mohammadi, Daniel Jung et al.
ICML 2024, arXiv:2311.15890, 4 citations