Papers by Chenheng Zhang
3 papers found
Conference
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu, Yuan Zhang, Yiming Dong et al.
NeurIPS 2025, arXiv:2510.16807
Language Ranker: A Lightweight Ranking framework for LLM Decoding
Chenheng Zhang, Tianqi Du, Jizhe Zhang et al.
NeurIPS 2025, arXiv:2510.21883
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang, Yifei Wang, Zhaoyang Liu et al.
ICLR 2025, arXiv:2410.23771
39 citations