Spotlight "long-context llms" Papers
2 papers found
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
NeurIPS 2025 (Spotlight) · arXiv:2502.13189 · 109 citations
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
Chaofan Lin, Jiaming Tang, Shuo Yang et al.
NeurIPS 2025 (Spotlight) · arXiv:2502.02770 · 14 citations