"kv cache eviction" Papers
3 papers found
Conference
Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference
Zijie Geng, Jie Wang, Ziqi Liu et al.
NEURIPS 2025
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park, Dalton Jones, Matthew Morse et al.
NEURIPS 2025, arXiv:2504.15364
17 citations
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
Wenxuan Zeng, Ye Dong, Jinjin Zhou et al.
NEURIPS 2025, arXiv:2501.06807
2 citations