"length extrapolation" Papers
7 papers found
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
Julien Siems, Timur Carstensen, Arber Zela et al.
NEURIPS 2025, arXiv:2502.10297
26 citations
DocMamba: Efficient Document Pre-training with State Space Model
Pengfei Hu, Zhenrong Zhang, Jiefeng Ma et al.
AAAI 2025, arXiv:2409.11887
2 citations
Hansel: Output Length Controlling Framework for Large Language Models
Seoha Song, Junhyun Lee, Hyeonmok Ko
AAAI 2025, arXiv:2412.14033
1 citation
LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang, Siyuan Liang, Yaning Tan et al.
NEURIPS 2025, arXiv:2503.04344
1 citation
Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory
Svetha Venkatesh, Kien Do, Hung Le et al.
ICLR 2025
Exploring Transformer Extrapolation
Zhen Qin, Yiran Zhong, Hui Deng
AAAI 2024, arXiv:2307.10156
12 citations
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Zhenyu He, Guhao Feng, Shengjie Luo et al.
ICML 2024, arXiv:2401.16421
23 citations