"rank collapse" Papers
6 papers found
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
Zheng-An Chen, Tao Luo
NeurIPS 2025 (oral), arXiv:2510.06954
1 citation
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph, Jerome Sieber, Melanie Zeilinger et al.
ICLR 2025, arXiv:2410.10609
2 citations
TF-MAS: Training-free Mamba2 Architecture Search
Yi Fan, Yu-Bin Yang
NeurIPS 2025
PIDformer: Transformer Meets Control Theory
Tam Nguyen, Cesar Uribe, Tan Nguyen et al.
ICML 2024, arXiv:2402.15989
12 citations
Self-attention Networks Localize When QK-eigenspectrum Concentrates
Han Bao, Ryuichiro Hataya, Ryo Karakida
ICML 2024, arXiv:2402.02098
11 citations
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
Akhil Kedia, Mohd Abbas Zaidi, Sushil Khyalia et al.
ICML 2024, arXiv:2403.09635
11 citations