"sparse activation" Papers
5 papers found
MoFRR: Mixture of Diffusion Models for Face Retouching Restoration
Jiaxin Liu, Qichao Ying, Zhenxing Qian et al.
ICCV 2025 · arXiv:2507.19770
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Ziteng Wang, Jun Zhu, Jianfei Chen
ICLR 2025 · arXiv:2412.14711 · 31 citations
SMoSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks
Mátyás Vincze, Laura Ferrarotti, Leonardo Lucio Custode et al.
AAAI 2025 · arXiv:2412.13053 · 2 citations
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Xiaoming Shi, Shiyu Wang, Yuqi Nie et al.
ICLR 2025 · arXiv:2409.16040 · 194 citations
Exploring the Benefit of Activation Sparsity in Pre-training
Zhengyan Zhang, Chaojun Xiao, Qiujieli Qin et al.
ICML 2024 · arXiv:2410.03440 · 6 citations