Poster "induction heads" Papers
3 papers found
Conference
Selective Induction Heads: How Transformers Select Causal Structures in Context
Francesco D'Angelo, Francesco Croce, Nicolas Flammarion
ICLR 2025, arXiv:2509.08184
6 citations
Better & Faster Large Language Models via Multi-token Prediction
Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Roziere et al.
ICML 2024, arXiv:2404.19737
232 citations
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
ICML 2024, arXiv:2402.14735
102 citations