"optimizer convergence analysis" Papers
2 papers found
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li, Xinwei Zhang, Peilin Zhong et al.
ICLR 2025, arXiv:2410.06441, 11 citations
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization
Shuoran Jiang, Qingcai Chen, Yang Xiang et al.
AAAI 2024, arXiv:2312.15184, 21 citations