"loss landscape geometry" Papers
6 papers found
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang, Jesse Hoogland, Stan van Wingerden et al.
ICLR 2025, arXiv:2410.02984
24 citations
SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures
Julian Kranz, Davide Gallon, Steffen Dereich et al.
NeurIPS 2025, arXiv:2505.09572
4 citations
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Bhavya Vasudeva, Deqing Fu, Tianyi Zhou et al.
ICLR 2025, arXiv:2403.06925
8 citations
Understanding Optimization in Deep Learning with Central Flows
Jeremy Cohen, Alex Damian, Ameet Talwalkar et al.
ICLR 2025, arXiv:2410.24206
22 citations
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi, Albert Manuel Orozco Camacho, Eugene Belilovsky et al.
ICML 2024, arXiv:2407.05385
12 citations
Simplicity Bias via Global Convergence of Sharpness Minimization
Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi et al.
ICML 2024, arXiv:2410.16401
3 citations