"transformer language models" Papers
9 papers found
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler Chang, Benjamin Bergen
NEURIPS 2025 (spotlight) · arXiv:2504.15471
2 citations
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang, Jesse Hoogland, Stan van Wingerden et al.
ICLR 2025 · arXiv:2410.02984
24 citations
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
NEURIPS 2025 (spotlight) · arXiv:2506.09251
8 citations
How to Scale Second-Order Optimization
Charlie Chen, Shikai Qiu, Hoang Phan et al.
NEURIPS 2025
Matrix Product Sketching via Coordinated Sampling
Majid Daliri, Juliana Freire, Danrong Li et al.
ICLR 2025 · arXiv:2501.17836
2 citations
Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models
Xiyu Liu, Zhengxiao Liu, Naibin Gu et al.
AAAI 2025 (paper) · arXiv:2408.15091
3 citations
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson, Lucy Farnik, Conor Houghton et al.
ICLR 2025 · arXiv:2409.04185
11 citations
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin, Mohammad Taufeeque, Noah Goodman
ICML 2024 · arXiv:2310.17230
41 citations
Observable Propagation: Uncovering Feature Vectors in Transformers
Jacob Dunefsky, Arman Cohan
ICML 2024 · arXiv:2312.16291
2 citations