"long-context llms" Papers
6 papers found
Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
Yongqiang Yao, Jingru Tan, Kaihuan Liang et al.
NeurIPS 2025 · 2 citations
Inference Scaling for Long-Context Retrieval Augmented Generation
Zhenrui Yue, Honglei Zhuang, Aijun Bai et al.
ICLR 2025 · arXiv:2410.04343 · 54 citations
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
NeurIPS 2025 (spotlight) · arXiv:2502.13189 · 109 citations
Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?
Jonathan Roberts, Kai Han, Samuel Albanie
ICLR 2025 · arXiv:2411.05000 · 10 citations
SALS: Sparse Attention in Latent Space for KV Cache Compression
Junlin Mu, Hantao Huang, Jihang Zhang et al.
NeurIPS 2025 · arXiv:2510.24273
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
Chaofan Lin, Jiaming Tang, Shuo Yang et al.
NeurIPS 2025 (spotlight) · arXiv:2502.02770 · 14 citations