
arXiv:2010.14075
Neural Network Approximation: Three Hidden Layers Are Enough

25 October 2020
Zuowei Shen
Haizhao Yang
Shijun Zhang
Abstract

A three-hidden-layer neural network with super approximation power is introduced. This network is built with the floor function ($\lfloor x\rfloor$), the exponential function ($2^x$), the step function ($1_{x\geq 0}$), or their compositions as the activation function in each neuron; hence we call such networks Floor-Exponential-Step (FLES) networks. For any width hyper-parameter $N\in\mathbb{N}^+$, it is shown that FLES networks with width $\max\{d,N\}$ and three hidden layers can uniformly approximate a Hölder continuous function $f$ on $[0,1]^d$ with an exponential approximation rate $3\lambda(2\sqrt{d})^{\alpha}2^{-\alpha N}$, where $\alpha\in(0,1]$ and $\lambda>0$ are the Hölder order and constant, respectively. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $2\omega_f(2\sqrt{d})\,2^{-N}+\omega_f(2\sqrt{d}\,2^{-N})$. Moreover, we extend this result to general bounded continuous functions on a bounded set $E\subseteq\mathbb{R}^d$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $\omega_f(r)$ as $r\rightarrow 0$ is moderate (e.g., $\omega_f(r)\lesssim r^\alpha$ for Hölder continuous functions), since the major term of concern in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ independent of $d$ inside the modulus of continuity. Finally, we extend our analysis to derive similar approximation results in the $L^p$-norm for $p\in[1,\infty)$ by replacing the Floor-Exponential-Step activation functions with continuous activation functions.
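The three activation functions named above, and the Hölder-case rate $3\lambda(2\sqrt{d})^{\alpha}2^{-\alpha N}$, can be written down directly. Below is a minimal sketch in NumPy; the function names (`floor_act`, `exp2_act`, `step_act`, `holder_rate`) are illustrative choices, and the full three-hidden-layer FLES construction from the paper is not reproduced here.

```python
import numpy as np

# The three FLES activations from the abstract.
def floor_act(x):
    # Floor function: x -> ⌊x⌋
    return np.floor(x)

def exp2_act(x):
    # Exponential function: x -> 2^x
    return np.power(2.0, x)

def step_act(x):
    # Step function: x -> 1_{x >= 0}
    return (np.asarray(x) >= 0).astype(float)

# Approximation rate stated in the abstract for a Hölder continuous
# function on [0,1]^d with order alpha in (0,1] and constant lam > 0:
# 3 * lam * (2*sqrt(d))^alpha * 2^(-alpha*N).
def holder_rate(d, N, alpha=1.0, lam=1.0):
    return 3.0 * lam * (2.0 * np.sqrt(d)) ** alpha * 2.0 ** (-alpha * N)
```

Evaluating `holder_rate` at successive widths shows the claimed behavior: for fixed `alpha`, each increment of `N` halves the error bound (when `alpha = 1`), while the dimension `d` enters only through the moderate factor $(2\sqrt{d})^{\alpha}$.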
