
Deep Network Approximation Characterized by Number of Neurons

Abstract

This paper quantitatively characterizes the approximation power of deep feed-forward neural networks (FNNs) in terms of the number of neurons. It is shown by construction that ReLU FNNs with width $\mathcal{O}\big(\max\{d\lfloor N^{1/d}\rfloor,\, N+1\}\big)$ and depth $\mathcal{O}(L)$ can approximate an arbitrary Hölder continuous function of order $\alpha\in(0,1]$ on $[0,1]^d$ with a nearly tight approximation rate $\mathcal{O}\big(\sqrt{d}\,N^{-2\alpha/d}L^{-2\alpha/d}\big)$ measured in the $L^p$-norm for any $N,L\in\mathbb{N}^+$ and $p\in[1,\infty]$. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $\mathcal{O}\big(\sqrt{d}\,\omega_f(N^{-2/d}L^{-2/d})\big)$. We also extend our analysis to functions $f$ defined on irregular domains or localized in an $\varepsilon$-neighborhood of a $d_{\mathcal{M}}$-dimensional smooth manifold $\mathcal{M}\subseteq[0,1]^d$ with $d_{\mathcal{M}}\ll d$. In particular, in the case of an essentially low-dimensional domain, we show an approximation rate $\mathcal{O}\big(\omega_f\big(\tfrac{\varepsilon}{1-\delta}\sqrt{\tfrac{d}{d_\delta}}+\varepsilon\big)+\sqrt{d}\,\omega_f\big(\tfrac{\sqrt{d}}{(1-\delta)\sqrt{d_\delta}}N^{-2/d_\delta}L^{-2/d_\delta}\big)\big)$ for ReLU FNNs to approximate $f$ in the $\varepsilon$-neighborhood, where $d_\delta=\mathcal{O}\big(d_{\mathcal{M}}\tfrac{\ln(d/\delta)}{\delta^2}\big)$ for any $\delta\in(0,1)$, the relative error with which a projection onto a $d_\delta$-dimensional domain approximates an isometry on $\mathcal{M}$.
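To make the scaling in these rates concrete, here is a minimal Python sketch. It is not from the paper: the function names are invented for illustration, and the hidden $\mathcal{O}(\cdot)$ constants are dropped, so the numbers indicate scaling rather than actual errors.

```python
import math

def relu_fnn_budget(d: int, N: int, L: int) -> tuple[int, int]:
    """Width/depth budget from the main theorem, with the hidden O(.)
    constants dropped: width O(max{d*floor(N^(1/d)), N+1}), depth O(L)."""
    width = max(d * math.floor(N ** (1.0 / d)), N + 1)
    return width, L

def holder_rate(d: int, N: int, L: int, alpha: float = 1.0) -> float:
    """Nearly tight L^p rate O(sqrt(d) * N^(-2a/d) * L^(-2a/d)) for a Holder
    continuous target of order alpha on [0,1]^d (constants omitted)."""
    return math.sqrt(d) * (N * L) ** (-2.0 * alpha / d)

def reduced_dim(d_M: int, d: int, delta: float) -> int:
    """d_delta = O(d_M * ln(d/delta) / delta^2): target dimension of a
    projection approximating an isometry on the manifold up to relative
    error delta (constants omitted)."""
    return math.ceil(d_M * math.log(d / delta) / delta ** 2)

# Doubling depth at a fixed width budget shrinks the bound by 2^(-2*alpha/d);
# the curse of dimensionality enters only through the exponent 2*alpha/d.
d, N, alpha = 8, 16, 1.0
for L in (1, 2, 4, 8):
    width, depth = relu_fnn_budget(d, N, L)
    print(f"width ~ {width}, depth ~ {depth}: rate ~ {holder_rate(d, N, L, alpha):.4f}")
print("d_delta for d_M=2, d=1000, delta=0.5:", reduced_dim(2, 1000, 0.5))
```

Since $N^{-2\alpha/d}L^{-2\alpha/d}=(NL)^{-2\alpha/d}$, the bound improves symmetrically in width and depth through the product $NL$, consistent with characterizing approximation power by the total number of neurons.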
