Poster "language model generalization" Papers
2 papers found
Conference
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
Xinyi Wang, Antonis Antoniades, Yanai Elazar et al.
ICLR 2025arXiv:2407.14985
80
citations
To Code or Not To Code? Exploring Impact of Code in Pre-training
Viraat Aryabumi, Yixuan Su, Raymond Ma et al.
ICLR 2025arXiv:2408.10914
44
citations