Poster "heavy-tailed data" Papers
2 papers found
Conference
Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law
Frederik Kunstner, Francis Bach
NEURIPS 2025 arXiv:2505.19227
12 citations
Subsampled Ensemble Can Improve Generalization Tail Exponentially
Huajie Qian, Donghao Ying, Henry Lam et al.
NEURIPS 2025 arXiv:2405.14741
1 citation