"adaptive optimizers" Papers
5 papers found
Conference
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.
ICLR 2025, arXiv:2410.17174
8 citations
Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness
Mojtaba Kolahdouzi, Hatice Gunes, Ali Etemad
NeurIPS 2025 (spotlight), arXiv:2504.14882
Understanding Optimization in Deep Learning with Central Flows
Jeremy Cohen, Alex Damian, Ameet Talwalkar et al.
ICLR 2025, arXiv:2410.24206
22 citations
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec, Felix Dangel, Sidak Pal Singh
ICLR 2025, arXiv:2410.10986
10 citations
MADA: Meta-Adaptive Optimizers Through Hyper-Gradient Descent
Kaan Ozkara, Can Karakus, Parameswaran Raman et al.
ICML 2024, arXiv:2401.08893
6 citations