Minimum Width for Deep, Narrow MLP: A Diffeomorphism Approach

3 citations

arXiv:2308.15873

#1580 of 5858 papers in NeurIPS 2025

Top Authors

Geonho Hwang

Abstract

Recently, there has been growing interest in determining the minimum width required for deep, narrow Multi-Layer Perceptrons (MLPs) to achieve the universal approximation property. Among the open problems, approximating a continuous function under the uniform norm is particularly challenging, as indicated by the significant gap between the known lower and upper bounds. To address this problem, we propose a framework that reduces finding the minimum width for deep, narrow MLPs to determining a purely geometrical function, denoted $w(d_x, d_y)$, which depends solely on the input dimension $d_x$ and the output dimension $d_y$. To this end, we first demonstrate that deep, narrow MLPs, given a small additional width, can approximate any $C^2$-diffeomorphism. Using this result, we then prove that $w(d_x, d_y)$ equals the optimal minimum width required for deep, narrow MLPs to achieve universality. By combining this framework with the Whitney embedding theorem, we provide an upper bound for the minimum width, given by $\max(2d_x+1, d_y) + \alpha(\sigma)$, where $0 \leq \alpha(\sigma) \leq 2$ is a constant depending explicitly on the activation function $\sigma$. Furthermore, we provide novel optimal values for the minimum width in several settings, including $w(2,2) = w(2,3) = 4$.
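As a small illustration of the upper bound quoted in the abstract, the sketch below simply evaluates $\max(2d_x+1, d_y) + \alpha(\sigma)$ for given dimensions. The function names `width_upper_bound` and `width_upper_bound_range` are hypothetical, and treating $\alpha(\sigma)$ as an integer in $\{0, 1, 2\}$ is an assumption made here; the abstract only states $0 \leq \alpha(\sigma) \leq 2$ and does not say which activation gives which value.

```python
def width_upper_bound(d_x: int, d_y: int, alpha: int) -> int:
    """Evaluate max(2*d_x + 1, d_y) + alpha, the upper bound from the abstract.

    `alpha` stands in for alpha(sigma); its value for a specific activation
    is not given in the abstract, so the caller must supply it (assumption:
    it is an integer between 0 and 2).
    """
    if d_x < 1 or d_y < 1:
        raise ValueError("d_x and d_y must be positive integers")
    if not 0 <= alpha <= 2:
        raise ValueError("the abstract states 0 <= alpha(sigma) <= 2")
    return max(2 * d_x + 1, d_y) + alpha


def width_upper_bound_range(d_x: int, d_y: int) -> tuple[int, int]:
    """Return the bound at both extremes of alpha (0 and 2)."""
    return width_upper_bound(d_x, d_y, 0), width_upper_bound(d_x, d_y, 2)


if __name__ == "__main__":
    # For d_x = 2, d_y = 3: max(5, 3) = 5, so the bound lies between 5 and 7.
    print(width_upper_bound_range(2, 3))
```

This only evaluates the general Whitney-based bound; it says nothing about the sharper setting-specific values such as $w(2,2) = w(2,3) = 4$ mentioned in the abstract.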

Citation History

[Chart: citation count over time, Jan 25, 2026 to Feb 13, 2026]