Poster "long context generation" Papers
2 papers found
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Ranajoy Sadhukhan, Jian Chen, Zhuoming Chen et al.
ICLR 2025, arXiv:2408.11049
64 citations
When Attention Sink Emerges in Language Models: An Empirical View
Xiangming Gu, Tianyu Pang, Chao Du et al.
ICLR 2025, arXiv:2410.10781
98 citations