"kernel regime" Papers
2 papers found
Conference
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui, Luca Pesce, Yatin Dandi et al.
ICML 2024 (spotlight), arXiv:2402.04980
26 citations
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
Mohamad Amin Mohamadi, Zhiyuan Li, Lei Wu et al.
ICML 2024, arXiv:2407.12332
19 citations