"continued pre-training" Papers
3 papers found
Conference
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
Zui Chen, Tianqiao Liu, Tongqing et al.
ICLR 2025arXiv:2501.14002
12
citations
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi, Fan Nie, Alexandre Alahi et al.
NEURIPS 2025oralarXiv:2506.16029
5
citations
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski et al.
ICML 2024arXiv:2403.09636
94
citations