"mixture-of-experts architectures" Papers
3 papers found
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Zijia Zhao, Longteng Guo, Jie Cheng et al.
ICLR 2025 · arXiv:2410.10456 · 8 citations
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
Mengru Wang, Xingyu Chen, Yue Wang et al.
NeurIPS 2025 · arXiv:2505.14681 · 10 citations
Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers
Xin Zhao, Xiaojun Chen, Bingshan Liu et al.
NeurIPS 2025 · arXiv:2510.13462