Hadi Daneshmand
8 papers · 333 total citations

Papers (8)
Transformers learn to implement preconditioned gradient descent for in-context learning · NeurIPS 2023 · arXiv · 252 citations
Batch Normalization Orthogonalizes Representations in Deep Random Networks · NeurIPS 2021 · arXiv · 40 citations
Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning · ICLR 2025 · arXiv · 15 citations
On the impact of activation and normalization in obtaining isometric embeddings at initialization · NeurIPS 2023 · arXiv · 12 citations
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion · ICLR 2024 · arXiv · 9 citations
Batch normalization provably avoids ranks collapse for randomly initialised deep networks · NeurIPS 2020 · arXiv · 4 citations
Linear Transformers Implicitly Discover Unified Numerical Algorithms · NeurIPS 2025 · arXiv · 1 citation
Rethinking the Variational Interpretation of Accelerated Optimization Methods · NeurIPS 2021 · 0 citations