Papers matching "efficient inference"
4 papers found
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
Wenhao Zheng, Yixiao Chen, Weitong Zhang et al.
COLM 2025, arXiv:2502.01976
24 citations
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Han Zhao, Min Zhang, Wei Zhao et al.
AAAI 2025, arXiv:2403.14520
110 citations
EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation
Hongwei Niu, Jie Hu, Jianghang Lin et al.
AAAI 2025, arXiv:2412.08628
6 citations
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng, Yihuai Hong, Hongliang Dai et al.
AAAI 2024, arXiv:2312.11882
17 citations