"sequence length scaling" Papers
3 papers found
Flash Invariant Point Attention
Andrew Liu, Axel Elaldi, Nicholas Franklin et al.
NEURIPS 2025, spotlight, arXiv:2505.11580
StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs
Qijun Luo, Mengqi Li, Lei Zhao et al.
NEURIPS 2025, arXiv:2506.03077
1 citation
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Jon Saad-Falcon, Daniel Y Fu, Simran Arora et al.
ICML 2024, arXiv:2402.07440
23 citations