Papers matching "pretraining data analysis" (2 found)
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data
Xinyi Wang, Antonis Antoniades, Yanai Elazar et al.
ICLR 2025 · arXiv:2407.14985 · 80 citations
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Laura Ruis, Maximilian Mozes, Juhan Bae et al.
ICLR 2025 · arXiv:2411.12580 · 28 citations