"learning dynamics" Papers

10 papers found

A Solvable Attention for Neural Scaling Laws

Bochen Lyu, Di Wang, Zhanxing Zhu

ICLR 2025
5
citations

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

George Wang, Jesse Hoogland, Stan van Wingerden et al.

ICLR 2025arXiv:2410.02984
24
citations

How do language models learn facts? Dynamics, curricula and hallucinations

Nicolas Zucchet, Jorg Bornschein, Stephanie C.Y. Chan et al.

COLM 2025paper
21
citations

Learning Dynamics of LLM Finetuning

YI REN, Danica Sutherland

ICLR 2025arXiv:2407.10490
67
citations

What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains

Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.

NEURIPS 2025spotlightarXiv:2508.07208
1
citations

Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes

Nabeel Seedat, Nicolas Huynh, Boris van Breugel et al.

ICML 2024arXiv:2312.12112
51
citations

Explaining Generalization Power of a DNN Using Interactive Concepts

Huilin Zhou, Hao Zhang, Huiqi Deng et al.

AAAI 2024paperarXiv:2302.13091
33
citations

Impact of Decentralized Learning on Player Utilities in Stackelberg Games

Kate Donahue, Nicole Immorlica, Meena Jagadeesan et al.

ICML 2024arXiv:2403.00188
6
citations

Prediction Accuracy of Learning in Games : Follow-the-Regularized-Leader meets Heisenberg

Yi Feng, Georgios Piliouras, Xiao Wang

ICML 2024arXiv:2406.10603
2
citations

Self-attention Networks Localize When QK-eigenspectrum Concentrates

Han Bao, Ryuichiro Hataya, Ryo Karakida

ICML 2024arXiv:2402.02098
11
citations