Poster "large language model training" Papers
2 papers found
COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training
Haocheng Xi, Han Cai, Ligeng Zhu et al.
ICLR 2025, arXiv:2410.19313
19 citations
Understanding the Training Speedup from Sampling with Approximate Losses
Rudrajit Das, Xi Chen, Bertram Ieong et al.
ICML 2024, arXiv:2402.07052
4 citations