"expert specialization" Papers
4 papers found
Advancing Expert Specialization for Better MoE
Hongcan Guo, Haolang Lu, Guoshun Nan et al.
NEURIPS 2025 (oral), arXiv:2505.22323, 10 citations
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura, Takuya Akiba, Kazuki Fujii et al.
ICLR 2025, arXiv:2502.19261, 9 citations
Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings
Yehya Farhat, Hamza ElMokhtar Shili, Fangshuo Liao et al.
NEURIPS 2025, arXiv:2306.08586, 3 citations
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Huy Nguyen, Pedram Akbarian, Nhat Ho
ICML 2024, arXiv:2401.13875, 20 citations