Poster Papers Matching "linear attention"
17 papers found
Alias-Free ViT: Fractional Shift Invariance via Linear Attention
Hagay Michaeli, Daniel Soudry
NeurIPS 2025 · arXiv:2510.22673
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
Songhua Liu, Zhenxiong Tan, Xinchao Wang
NeurIPS 2025 · arXiv:2412.16112
20 citations
Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
Naoki Nishikawa, Rei Higuchi, Taiji Suzuki
NeurIPS 2025 · arXiv:2507.03340
1 citation
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Haocheng Xi et al.
NeurIPS 2025 · arXiv:2508.15884
16 citations
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
Zeyuan Allen-Zhu
NeurIPS 2025 · arXiv:2512.17351
12 citations
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng, Yadan Luo, Xin Li et al.
ICLR 2025 · arXiv:2501.15061
42 citations
ThunderKittens: Simple, Fast, and Adorable Kernels
Benjamin Spector, Simran Arora, Aaryan Singhal et al.
ICLR 2025
3 citations
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han, Tianzhu Ye, Yizeng Han et al.
ECCV 2024 · arXiv:2312.08874
212 citations
AttnZero: Efficient Attention Discovery for Vision Transformers
Lujun Li, Zimian Wei, Peijie Dong et al.
ECCV 2024
14 citations
DiJiang: Efficient Large Language Models through Compact Kernelization
Hanting Chen, Zhicheng Liu, Xutao Wang et al.
ICML 2024 · arXiv:2403.19928
11 citations
Gated Linear Attention Transformers with Hardware-Efficient Training
Songlin Yang, Bailin Wang, Yikang Shen et al.
ICML 2024 · arXiv:2312.06635
329 citations
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Zhiyu Yao, Jian Wang, Haixu Wu et al.
ICML 2024
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
Chenhang He, Ruihuang Li, Guowen Zhang et al.
ECCV 2024 · arXiv:2401.00912
13 citations
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Zicheng Liu, Siyuan Li, Li Wang et al.
ICML 2024 · arXiv:2406.08128
10 citations
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Jialong Guo, Xinghao Chen, Yehui Tang et al.
ICML 2024 · arXiv:2405.11582
34 citations
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin, Weigao Sun, Dong Li et al.
ICML 2024 · arXiv:2405.17381
24 citations
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You, Yichao Fu, Zheng Wang et al.
ICML 2024 · arXiv:2406.07368
9 citations