Poster "gradient noise analysis" Papers
2 papers found
Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov et al.
ICLR 2025, arXiv:2411.15958
17 citations
Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers
Lei Chen, Joan Bruna, Alberto Bietti
ICLR 2025, arXiv:2406.03068
8 citations