"mixture-of-experts models" Papers

11 papers found

Advancing Expert Specialization for Better MoE

Hongcan Guo, Haolang Lu, Guoshun Nan et al.

NeurIPS 2025 (Oral) · arXiv:2505.22323
10 citations

DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models

Yongqi Huang, Peng Ye, Chenyu Huang et al.

CVPR 2025 · arXiv:2503.01359
6 citations

Discovering Important Experts for Mixture-of-Experts Models Pruning Through a Theoretical Perspective

Weizhong Huang, Yuxin Zhang, Xiawu Zheng et al.

NeurIPS 2025

Enhanced Expert Merging for Mixture-of-Experts in Graph Foundation Models

Lei Liu, Xingyu Xia, Qianqian Xie et al.

NeurIPS 2025

Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models

Keisuke Kamahori, Tian Tang, Yile Gu et al.

ICLR 2025 · arXiv:2402.07033
48 citations

Mixture Compressor for Mixture-of-Experts LLMs Gains More

Wei Huang, Yue Liao, Jianhui Liu et al.

ICLR 2025 · arXiv:2410.06270
24 citations

MoEMeta: Mixture-of-Experts Meta Learning for Few-Shot Relational Learning

Han Wu, Jie Yin

NeurIPS 2025 · arXiv:2510.23013
1 citation

ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

Ziteng Wang, Jun Zhu, Jianfei Chen

ICLR 2025 · arXiv:2412.14711
31 citations

The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

Hong Li, Nanxi Li, Yuanjie Chen et al.

ICLR 2025 · arXiv:2410.01417
3 citations

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Fuzhao Xue, Zian Zheng, Yao Fu et al.

ICML 2024 · arXiv:2402.01739
160 citations

Scaling Beyond the GPU Memory Limit for Large Mixture-of-Experts Model Training

Yechan Kim, Hwijoon Lim, Dongsu Han

ICML 2024