"attention computation" Papers
4 papers found
Achieving More with Less: Additive Prompt Tuning for Rehearsal-Free Class-Incremental Learning
Haoran Chen, Ping Wang, Zihan Zhou et al.
ICCV 2025, arXiv:2503.07979
1 citation
Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
Yongqiang Yao, Jingru Tan, Kaihuan Liang et al.
NeurIPS 2025
2 citations
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
Wenxuan Zeng, Ye Dong, Jinjin Zhou et al.
NeurIPS 2025, arXiv:2501.06807
2 citations
Training-Free Long-Context Scaling of Large Language Models
Chenxin An, Fei Huang, Jun Zhang et al.
ICML 2024, arXiv:2402.17463
60 citations