"sparse attention" Papers

23 papers found

Adaptive Computation Pruning for the Forgetting Transformer

Zhixuan Lin, Johan Obando-Ceron, Xu Owen He et al.

COLM 2025 · arXiv:2504.06949 · 3 citations

DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference

Chong Wu, Jiawang Cao, Renjie Xu et al.

NEURIPS 2025

FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

Xunhao Lai, Jianqiao Lu, Yao Luo et al.

ICLR 2025 · arXiv:2502.20766 · 62 citations

Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access

Xiang Hu, Jiaqi Leng, Jun Zhao et al.

NEURIPS 2025 · arXiv:2504.16795 · 3 citations

Inference-Time Hyper-Scaling with KV Cache Compression

Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.

NEURIPS 2025 · arXiv:2506.05345 · 17 citations

Kinetics: Rethinking Test-Time Scaling Law

Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng et al.

NEURIPS 2025 · arXiv:2506.05333 · 8 citations

LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions

Ravindran Kannan, Chiranjib Bhattacharyya, Praneeth Kacham et al.

ICLR 2025 · arXiv:2410.05462 · 2 citations

MagicPIG: LSH Sampling for Efficient LLM Generation

Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye et al.

ICLR 2025 · arXiv:2410.16179 · 69 citations

MoBA: Mixture of Block Attention for Long-Context LLMs

Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.

NEURIPS 2025 (spotlight) · arXiv:2502.13189 · 109 citations

Overcoming Long Context Limitations of State Space Models via Context Dependent Sparse Attention

Zhihao Zhan, Jianan Zhao, Zhaocheng Zhu et al.

NEURIPS 2025

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Kwanyoung Kim, Byeongsu Sim

ICCV 2025 · arXiv:2503.07677 · 1 citation

Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape

Ruichen Chen, Keith Mills, Liyao Jiang et al.

NEURIPS 2025 (oral) · arXiv:2505.22918 · 1 citation

SALS: Sparse Attention in Latent Space for KV Cache Compression

Junlin Mu, Hantao Huang, Jihang Zhang et al.

NEURIPS 2025 · arXiv:2510.24273

SCENT: Robust Spatiotemporal Learning for Continuous Scientific Data via Scalable Conditioned Neural Fields

David K Park, Xihaier Luo, Guang Zhao et al.

ICML 2025 (oral) · arXiv:2504.12262 · 1 citation

Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention

Chong You, Kan Wu, Zhipeng Jia et al.

NEURIPS 2025 · 2 citations

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Shuo Yang, Haocheng Xi, Yilong Zhao et al.

NEURIPS 2025 (spotlight) · arXiv:2505.18875 · 40 citations

The emergence of sparse attention: impact of data distribution and benefits of repetition

Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.

NEURIPS 2025 (oral) · arXiv:2505.17863 · 7 citations

Training-free and Adaptive Sparse Attention for Efficient Long Video Generation

Yifei Xia, Suhan Ling, Fangcheng Fu et al.

ICCV 2025 · arXiv:2502.21079 · 35 citations

Transformers Learn Faster with Semantic Focus

Parikshit Ram, Kenneth Clarkson, Tim Klinger et al.

NEURIPS 2025 · arXiv:2506.14095

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

Chaofan Lin, Jiaming Tang, Shuo Yang et al.

NEURIPS 2025 (spotlight) · arXiv:2502.02770 · 14 citations

VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting

Junhyeok Kang, Yooju Shin, Jae-Gil Lee

AAAI 2025 · arXiv:2501.14183 · 3 citations

Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring

Huicong Zhang, Haozhe Xie, Hongxun Yao

CVPR 2024 · arXiv:2406.07551 · 18 citations

MultiMax: Sparse and Multi-Modal Attention Learning

Yuxuan Zhou, Mario Fritz, Margret Keuper

ICML 2024 · arXiv:2406.01189 · 1 citation