"attention sparsity" Papers
4 papers found
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.
CVPR 2025 (highlight), arXiv:2503.05936
4 citations
Scale-invariant attention
Ben Anson, Xi Wang, Laurence Aitchison
NEURIPS 2025, arXiv:2505.17083
2 citations
SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
Yizhao Gao, Zhichen Zeng, DaYou Du et al.
NEURIPS 2025
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
Chaofan Lin, Jiaming Tang, Shuo Yang et al.
NEURIPS 2025 (spotlight), arXiv:2502.02770
14 citations