Papers by Diyuan Wu
2 papers found
Attention with Trained Embeddings Provably Selects Important Tokens
Diyuan Wu, Aleksandr Shevchenko, Samet Oymak et al.
NeurIPS 2025, arXiv:2505.17282
Neural Collapse Beyond the Unconstrained Features Model: Landscape, Dynamics, and Generalization in the Mean-Field Regime
Diyuan Wu, Marco Mondelli
ICML 2025 (spotlight), arXiv:2501.19104
1 citation