Poster "stochastic gradient descent" Papers
41 papers found
Conference
Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
Sebastian Kassing, Simon Weissmann, Leif Döring
Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
Ofir Gaash, Kfir Y. Levy, Yair Carmon
Descent with Misaligned Gradients and Applications to Hidden Convexity
Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar et al.
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent
Ziyang Wei, Jiaqi Li, Zhipeng Lou et al.
Gradient correlation is a key ingredient to accelerate SGD with momentum
Julien Hermant, Marien Renaud, Jean-François Aujol et al.
Leveraging Flatness to Improve Information-Theoretic Generalization Bounds for SGD
Ze Peng, Jian Zhang, Yisen Wang et al.
Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning
Liu Ziyin, Yizhou Xu, Isaac Chuang
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Marina Sheshukova, Denis Belomestny, Alain Oliviero Durmus et al.
Online robust locally differentially private learning for nonparametric regression
Chenfei Gu, Qiangqiang Zhang, Ting Li et al.
Online Statistical Inference in Decision Making with Matrix Context
Qiyu Han, Will Wei Sun, Yichen Zhang
On the Convergence of Stochastic Smoothed Multi-Level Compositional Gradient Descent Ascent
Xinwen Zhang, Hongchang Gao
Optimal Rates in Continual Linear Regression via Increasing Regularization
Ran Levinstein, Amit Attia, Matan Schliserman et al.
Revisiting Large-Scale Non-convex Distributionally Robust Optimization
Qi Zhang, Yi Zhou, Simon Khan et al.
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
Shaocong Ma, Heng Huang
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Chenyu Zhang, Xu Chen, Xuan Di
Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks
Xuan Tang, Han Zhang, Yuan Cao et al.
A Doubly Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization
Hongchang Gao
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
Amit Peleg, Matthias Hein
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.
Delving into the Convergence of Generalized Smooth Minimax Optimization
Wenhan Xian, Ziyi Chen, Heng Huang
Demystifying SGD with Doubly Stochastic Gradients
Kyurae Kim, Joohwan Ko, Yian Ma et al.
Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods
Hao Di, Haishan Ye, Xiangyu Chang et al.
Efficient Online Set-valued Classification with Bandit Feedback
Zhou Wang, Xingye Qiao
Generalization Analysis of Stochastic Weight Averaging with General Sampling
Wang Peng, Li Shen, Zerui Tao et al.
How Private are DP-SGD Implementations?
Lynn Chua, Badih Ghazi, Pritish Kamath et al.
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
Yijun Wan, Melih Barsbey, Abdellatif Zaidi et al.
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
Batiste Le Bars, Aurélien Bellet, Marc Tommasi et al.
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Anselm Paulus, Georg Martius, Vit Musil
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp, Ruben Ohana, Michael Eickenberg et al.
On Convergence of Incremental Gradient for Non-convex Smooth Functions
Anastasiia Koloskova, Nikita Doikov, Sebastian Stich et al.
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala et al.
On the Generalization of Stochastic Gradient Descent with Momentum
Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher et al.
Plug-and-Play image restoration with Stochastic deNOising REgularization
Marien Renaud, Jean Prost, Arthur Leclaire et al.
Random features models: a way to study the success of naive imputation
Alexis Ayme, Claire Boyer, Aymeric Dieuleveut et al.
Random Scaling and Momentum for Non-smooth Non-convex Optimization
Qinzi Zhang, Ashok Cutkosky
Sliding Down the Stairs: How Correlated Latent Variables Accelerate Learning with Neural Networks
Lorenzo Bardone, Sebastian Goldt
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Ming Yang, Xiyuan Wei, Tianbao Yang et al.
The Role of Learning Algorithms in Collective Action
Omri Ben-Dov, Jake Fawkes, Samira Samadi et al.
Understanding Forgetting in Continual Learning with Linear Regression
Meng Ding, Kaiyi Ji, Di Wang et al.
What is the Long-Run Distribution of Stochastic Gradient Descent? A Large Deviations Analysis
Waïss Azizian, Franck Iutzeler, Jérôme Malick et al.