"mixture-of-experts architecture" Papers
7 papers found
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu, Zeyu Huang, Shuang Cheng et al.
ICLR 2025 · arXiv:2408.06793 · 8 citations
MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
Kaixing Yang, Xulong Tang, Ziqiao Peng et al.
NEURIPS 2025 · arXiv:2505.17543 · 5 citations
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
Jingwei Xu, Junyu Lai, Yunpeng Huang
ICLR 2025 · arXiv:2405.13053 · 13 citations
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He et al.
NEURIPS 2025 · arXiv:2505.11883 · 5 citations
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener et al.
ICLR 2025 · arXiv:2410.19034 · 14 citations
Mozart: Modularized and Efficient MoE Training on 3.5D Wafer-Scale Chiplet Architectures
Shuqing Luo, Ye Han, Pingzhi Li et al.
NEURIPS 2025 (spotlight)
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer, Daniel Durstewitz
NEURIPS 2025 (oral) · arXiv:2505.13192 · 8 citations