"long-context generation" Papers
6 papers found
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Xinyu Yang, Tianqi Chen, Beidi Chen
ICLR 2025, arXiv:2502.05431
16 citations
A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention
Heejun Lee, Geon Park, Youngwan Lee et al.
ICLR 2025, arXiv:2406.09827
9 citations
LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.
NeurIPS 2025, arXiv:2410.01735
6 citations
Learning to Reason for Long-Form Story Generation
Alexander Gurung, Mirella Lapata
COLM 2025
19 citations
PolarQuant: Leveraging Polar Transformation for Key Cache Quantization and Decoding Acceleration
Songhao Wu, Ang Lv, xiao feng et al.
NeurIPS 2025
Streaming Attention Approximation via Discrepancy Theory
Ekaterina Kochetkova, Kshiteej Jitesh Sheth, Insu Han et al.
NeurIPS 2025 (spotlight), arXiv:2502.07861
2 citations