by Surin Ahn Papers
3 papers found
Conference
MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention
Yucheng Li, Huiqiang Jiang, Chengruidong Zhang et al.
ICML 2025oral
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
Yucheng Li, Huiqiang Jiang, Qianhui Wu et al.
ICLR 2025arXiv:2412.10319
38
citations
SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression
Yucheng Li, Surin Ahn, Huiqiang Jiang et al.
COLM 2025paperarXiv:2506.12707
5
citations