Sebastian Jaszczur
4 papers, 270 total citations
Papers (4)
Sparse is Enough in Scaling Transformers (NeurIPS 2021, arXiv): 120 citations
Scaling Laws for Fine-Grained Mixture of Experts (ICML 2024, arXiv): 120 citations
Structured Packing in LLM Training Improves Long Context Utilization (AAAI 2025, arXiv): 16 citations
Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient (ICML 2025, arXiv): 14 citations