"adaptive inference" Papers
5 papers found
Conference
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
Yiwu Zhong, Zhuoming Liu, Yin Li et al.
ICCV 2025, arXiv:2412.03248
24 citations
Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
Divya Jyoti Bajpai, Manjesh Kumar Hanawal
NEURIPS 2025, arXiv:2509.23666
1 citation
Dynamic Diffusion Transformer
Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.
ICLR 2025, arXiv:2410.03456
38 citations
Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs
Hao Kang, Qingru Zhang, Han Cai et al.
NEURIPS 2025 (spotlight), arXiv:2505.19481
6 citations
Flextron: Many-in-One Flexible Large Language Model
Ruisi Cai, Saurav Muralidharan, Greg Heinrich et al.
ICML 2024, arXiv:2406.10260
34 citations