Papers by Zaifeng Pan
2 papers found
Conference
KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows
Zaifeng Pan, Ajjkumar Dahyalal Patel, Yipeng Shen et al.
NeurIPS 2025 (oral), arXiv:2507.07400
9 citations
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding
Yue Guan, Changming Yu, Shihan Fang et al.
NeurIPS 2025, arXiv:2512.23858
1 citation