"stochastic gradient descent" Papers

45 papers found

Approximating Metric Magnitude of Point Sets

Rayna Andreeva, James Ward, Primoz Skraba et al.

AAAI 2025 · arXiv:2409.04411 · 3 citations

Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization

Sebastian Kassing, Simon Weissmann, Leif Döring

NeurIPS 2025 · arXiv:2505.11434 · 2 citations

Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions

Ofir Gaash, Kfir Y. Levy, Yair Carmon

NeurIPS 2025 · arXiv:2502.16492 · 5 citations

Descent with Misaligned Gradients and Applications to Hidden Convexity

Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar et al.

ICLR 2025

Emergence and scaling laws in SGD learning of shallow neural networks

Yunwei Ren, Eshaan Nichani, Denny Wu et al.

NeurIPS 2025 · arXiv:2504.19983 · 17 citations

Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent

Ziyang Wei, Jiaqi Li, Zhipeng Lou et al.

NeurIPS 2025

Gradient correlation is a key ingredient to accelerate SGD with momentum

Julien Hermant, Marien Renaud, Jean-François Aujol et al.

ICLR 2025 · arXiv:2410.07870 · 3 citations

Leveraging Flatness to Improve Information-Theoretic Generalization Bounds for SGD

Ze Peng, Jian Zhang, Yisen Wang et al.

ICLR 2025 · arXiv:2601.01465

Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning

Liu Ziyin, Yizhou Xu, Isaac Chuang

NeurIPS 2025 · arXiv:2505.12387 · 5 citations

Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation

Marina Sheshukova, Denis Belomestny, Alain Oliviero Durmus et al.

ICLR 2025 · arXiv:2410.05106 · 4 citations

Online robust locally differentially private learning for nonparametric regression

Chenfei Gu, Qiangqiang Zhang, Ting Li et al.

NeurIPS 2025

Online Statistical Inference in Decision Making with Matrix Context

Qiyu Han, Will Wei Sun, Yichen Zhang

NeurIPS 2025 · arXiv:2212.11385 · 2 citations

On the Convergence of Stochastic Smoothed Multi-Level Compositional Gradient Descent Ascent

Xinwen Zhang, Hongchang Gao

NeurIPS 2025

Optimal Rates in Continual Linear Regression via Increasing Regularization

Ran Levinstein, Amit Attia, Matan Schliserman et al.

NeurIPS 2025 · arXiv:2506.06501 · 2 citations

Revisiting Large-Scale Non-convex Distributionally Robust Optimization

Qi Zhang, Yi Zhou, Simon Khan et al.

ICLR 2025 · 1 citation

Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations

Shaocong Ma, Heng Huang

ICLR 2025 · arXiv:2510.19975 · 12 citations

Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation

Chenyu Zhang, Xu Chen, Xuan Di

ICLR 2025 · arXiv:2408.08192 · 7 citations

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang

NeurIPS 2025 (oral) · arXiv:2401.07844 · 13 citations

Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks

Xuan Tang, Han Zhang, Yuan Cao et al.

NeurIPS 2025 · arXiv:2510.11354

A Doubly Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization

Hongchang Gao

ICML 2024

Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks

Amit Peleg, Matthias Hein

ICML 2024 · arXiv:2407.03848

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.

ICML 2024 · arXiv:2306.04815 · 25 citations

Delving into the Convergence of Generalized Smooth Minimax Optimization

Wenhan Xian, Ziyi Chen, Heng Huang

ICML 2024

Demystifying SGD with Doubly Stochastic Gradients

Kyurae Kim, Joohwan Ko, Yian Ma et al.

ICML 2024 · arXiv:2406.00920 · 2 citations

Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods

Hao Di, Haishan Ye, Xiangyu Chang et al.

ICML 2024

Efficient Online Set-valued Classification with Bandit Feedback

Zhou Wang, Xingye Qiao

ICML 2024 · arXiv:2405.04393 · 1 citation

Generalization Analysis of Stochastic Weight Averaging with General Sampling

Wang Peng, Li Shen, Zerui Tao et al.

ICML 2024

How Private are DP-SGD Implementations?

Lynn Chua, Badih Ghazi, Pritish Kamath et al.

ICML 2024 · arXiv:2403.17673 · 22 citations

Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD

Yijun Wan, Melih Barsbey, Abdellatif Zaidi et al.

ICML 2024 · arXiv:2306.08125 · 5 citations

Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm

Batiste Le Bars, Aurélien Bellet, Marc Tommasi et al.

ICML 2024 · arXiv:2306.02939 · 11 citations

LPGD: A General Framework for Backpropagation through Embedded Optimization Layers

Anselm Paulus, Georg Martius, Vit Musil

ICML 2024 · arXiv:2407.05920 · 4 citations

MoMo: Momentum Models for Adaptive Learning Rates

Fabian Schaipp, Ruben Ohana, Michael Eickenberg et al.

ICML 2024 · arXiv:2305.07583 · 20 citations

On Convergence of Incremental Gradient for Non-convex Smooth Functions

Anastasiia Koloskova, Nikita Doikov, Sebastian Stich et al.

ICML 2024 · arXiv:2305.19259 · 6 citations

Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs

Luca Arnaboldi, Yatin Dandi, Florent Krzakala et al.

ICML 2024

On the Generalization of Stochastic Gradient Descent with Momentum

Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher et al.

ICML 2024 · arXiv:1809.04564 · 36 citations

Plug-and-Play image restoration with Stochastic deNOising REgularization

Marien Renaud, Jean Prost, Arthur Leclaire et al.

ICML 2024 · arXiv:2402.01779 · 17 citations

Random features models: a way to study the success of naive imputation

Alexis Ayme, Claire Boyer, Aymeric Dieuleveut et al.

ICML 2024 · arXiv:2402.03839 · 7 citations

Random Scaling and Momentum for Non-smooth Non-convex Optimization

Qinzi Zhang, Ashok Cutkosky

ICML 2024 · arXiv:2405.09742 · 8 citations

Sliding Down the Stairs: How Correlated Latent Variables Accelerate Learning with Neural Networks

Lorenzo Bardone, Sebastian Goldt

ICML 2024 · arXiv:2404.08602 · 11 citations

Sparse Variational Student-t Processes

Jian Xu, Delu Zeng

AAAI 2024 · arXiv:2312.05568 · 7 citations

Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms

Ming Yang, Xiyuan Wei, Tianbao Yang et al.

ICML 2024 · arXiv:2307.03357 · 3 citations

The Role of Learning Algorithms in Collective Action

Omri Ben-Dov, Jake Fawkes, Samira Samadi et al.

ICML 2024 · arXiv:2405.06582 · 10 citations

Tuning-Free Stochastic Optimization

Ahmed Khaled, Chi Jin

ICML 2024 (spotlight) · arXiv:2402.07793 · 13 citations

Understanding Forgetting in Continual Learning with Linear Regression

Meng Ding, Kaiyi Ji, Di Wang et al.

ICML 2024 · arXiv:2405.17583 · 18 citations

What is the Long-Run Distribution of Stochastic Gradient Descent? A Large Deviations Analysis

Waïss Azizian, Franck Iutzeler, Jérôme Malick et al.

ICML 2024 · arXiv:2406.09241 · 14 citations