
Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse of Dimensionality in Approximation on Hölder Class

SIAM Journal on Mathematical Analysis (SIAM J. Math. Anal.), 2021
Abstract

In this paper, we construct neural networks with ReLU, sine, and $2^x$ as activation functions. For a general continuous function $f$ defined on $[0,1]^d$ with modulus of continuity $\omega_f(\cdot)$, we construct ReLU-sine-$2^x$ networks that enjoy an approximation rate $\mathcal{O}\!\left(\omega_f(\sqrt{d})\cdot 2^{-M}+\omega_f\!\left(\frac{\sqrt{d}}{N}\right)\right)$, where $M,N\in\mathbb{N}^{+}$ denote the hyperparameters related to the widths of the networks. As a consequence, we can construct a ReLU-sine-$2^x$ network with depth $5$ and width $\max\left\{\left\lceil 2d^{3/2}\left(\frac{3\mu}{\epsilon}\right)^{1/\alpha}\right\rceil,\, 2\left\lceil\log_2\frac{3\mu d^{\alpha/2}}{2\epsilon}\right\rceil+2\right\}$ that approximates $f\in\mathcal{H}_{\mu}^{\alpha}([0,1]^d)$ within a given tolerance $\epsilon>0$ measured in the $L^p$ norm for $p\in[1,\infty)$, where $\mathcal{H}_{\mu}^{\alpha}([0,1]^d)$ denotes the class of Hölder continuous functions on $[0,1]^d$ with order $\alpha\in(0,1]$ and constant $\mu>0$. Therefore, ReLU-sine-$2^x$ networks overcome the curse of dimensionality on $\mathcal{H}_{\mu}^{\alpha}([0,1]^d)$. In addition to their super expressive power, the functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable, enabling us to train them with SGD.
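To make the width bound concrete, here is a minimal sketch that evaluates the formula stated in the abstract for given $d$, $\alpha$, $\mu$, and $\epsilon$. The function name and the demo parameter values are illustrative choices, not part of the paper; only the formula itself comes from the abstract.

```python
import math

def relu_sine_exp_width(d: int, alpha: float, mu: float, eps: float) -> int:
    """Width bound from the abstract for approximating f in the Hölder
    class H^alpha_mu([0,1]^d) to tolerance eps with a depth-5
    ReLU-sine-2^x network:
        max{ ceil(2 d^{3/2} (3 mu / eps)^{1/alpha}),
             2 ceil(log2(3 mu d^{alpha/2} / (2 eps))) + 2 }.
    """
    term1 = math.ceil(2 * d ** 1.5 * (3 * mu / eps) ** (1 / alpha))
    term2 = 2 * math.ceil(math.log2(3 * mu * d ** (alpha / 2) / (2 * eps))) + 2
    return max(term1, term2)

# For fixed eps the width grows only polynomially in d (the first term
# scales like d^{3/2}), which is the sense in which the bound avoids
# the curse of dimensionality.
for d in (10, 100, 1000):
    print(d, relu_sine_exp_width(d, alpha=1.0, mu=1.0, eps=0.1))
```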
