"communication-efficient training" Papers
2 papers found
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary Charles, Gabriel Teston, Lucio Dery et al.
NeurIPS 2025 (spotlight), arXiv:2503.09799
14 citations
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob, Lorenzo Sani, Meghdad Kurmanji et al.
ICLR 2025, arXiv:2410.05021
2 citations