"pipeline parallelism" Papers
7 papers found
DynaPipe: Dynamic Layer Redistribution for Efficient Serving of LLMs with Pipeline Parallelism
HongXin Xu, Tianyu Guo, Xianwei Zhang
NeurIPS 2025
Efficient Long Context Fine-tuning with Chunk Flow
Xiulong Yuan, Hongtao Xu, Wenting Shen et al.
ICML 2025, arXiv:2503.02356
3 citations
Nesterov Method for Asynchronous Pipeline Parallel Optimization
Thalaiyasingam Ajanthan, Sameera Ramasinghe, Yan Zuo et al.
ICML 2025, arXiv:2505.01099
2 citations
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen, Xuchen Pan, Yaliang Li et al.
ICML 2024, arXiv:2312.04916
60 citations
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Youhe Jiang, Ran Yan, Xiaozhe Yao et al.
ICML 2024, arXiv:2311.11514
34 citations
Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training
Lin Lu, Chenxi Dai, Wangcheng Tao et al.
ICML 2024
Practical Performance Guarantees for Pipelined DNN Inference
Aaron Archer, Matthew Fahrbach, Kuikui Liu et al.
ICML 2024 spotlight, arXiv:2311.03703
1 citation