"batch size optimization" Papers
4 papers found
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
NeurIPS 2025, arXiv:2505.13738
17 citations
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala et al.
ICML 2024
Poly-View Contrastive Learning
Amitis Shidani, R Devon Hjelm, Jason Ramapuram et al.
ICLR 2024, arXiv:2403.05490
9 citations
Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimisation
Ossi Räisä, Joonas Jälkö, Antti Honkela
ICML 2024, arXiv:2402.03990
8 citations