Poster "generative inference" Papers
3 papers found
Conference
HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment
Youhe Jiang, Ran Yan, Binhang Yuan
ICLR 2025, arXiv:2502.07903
21 citations
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Youhe Jiang, Ran Yan, Xiaozhe Yao et al.
ICML 2024, arXiv:2311.11514
34 citations
SqueezeLLM: Dense-and-Sparse Quantization
Sehoon Kim, Coleman Hooper, Amir Gholaminejad et al.
ICML 2024, arXiv:2306.07629
272 citations