Stochastic Modified Equations and Dynamics of Dropout Algorithm

11 citations
arXiv:2305.15850
#1269 of 2297 papers in ICLR 2024

Abstract

Dropout is a widely used regularization technique in neural network training, yet its underlying mechanism and its effect on generalization are not fully understood. In this work, we first rigorously derive stochastic modified equations that approximate the discrete iterative process of dropout, and we experimentally verify that the resulting stochastic differential equation (SDE) approximates dropout under a wider range of settings. We then empirically study how dropout helps training find flatter minima, using intuitive approximations that exploit the structural similarity between the Hessian of the loss landscape and the covariance of dropout. Our empirical findings show that the Hessian-variance alignment relation holds throughout dropout training.
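As a rough illustration of the randomness that a continuous-time approximation of dropout has to account for, the sketch below samples inverted-dropout masks and checks their mean and variance. It is not the paper's derivation: the function names `dropout_mask` and `mask_statistics`, the keep probability of 0.8, and the use of inverted scaling (dividing by the keep probability) are all assumptions made for this example.

```python
import numpy as np


def dropout_mask(shape, keep_prob, rng):
    """Inverted-dropout mask (assumed convention): each entry is
    1/keep_prob with probability keep_prob, and 0 otherwise."""
    return rng.binomial(1, keep_prob, size=shape) / keep_prob


def mask_statistics(keep_prob, n_samples=200_000, seed=0):
    """Empirical mean and variance of a single mask entry."""
    rng = np.random.default_rng(seed)
    m = dropout_mask(n_samples, keep_prob, rng)
    return m.mean(), m.var()


keep_prob = 0.8  # hypothetical value chosen for illustration
mean, var = mask_statistics(keep_prob)

# For a scaled Bernoulli variable the mean is 1 and the variance is
# (1 - keep_prob) / keep_prob, so the mask leaves activations unbiased
# while injecting noise whose size grows as keep_prob shrinks.
print("empirical mean:", mean)
print("empirical variance:", var)
print("theoretical variance:", (1 - keep_prob) / keep_prob)
```

Under these assumptions the mask is unbiased, so a dropout update matches a deterministic update on average, and the per-step fluctuation around that average is the part a noise term would have to capture.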

Citation History

Jan 28, 2026: 0
Feb 13, 2026: 11 (+11)