Papers by Jianwei Zhang
2 papers found
Conference
CateKV: On Sequential Consistency for Long-Context LLM Inference Acceleration
Haoyun Jiang, Haolin Li, Jianwei Zhang et al.
ICML 2025
1 citation
Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference
Ke Yi, Zengke Liu, Jianwei Zhang et al.
ICLR 2025, arXiv:2409.20361
4 citations