On overcoming the Curse of Dimensionality in Neural Networks

Abstract

Let $A$ be a set and $V$ a Hilbert space. Let $H$ be a Hilbert space of functions $f:A\to V$ such that $\sup_{x\in A}\Vert f(x)\Vert_{V}\leq M \Vert f\Vert_H$. For $i=1,\dots,n$, let $(x_i,y_i)\in A\times V$ comprise our dataset. Let $f^*\in H$ be the unique global minimizer of the functional
\begin{equation*}
u(f) = \frac{\lambda}{2}\Vert f\Vert_{H}^{2} + \frac{1}{2n}\sum_{i=1}^{n}\Vert f(x_i)-y_i\Vert_{V}^{2}.
\end{equation*}
In this paper we show that for each $k\in\mathbb{N}$ there exists a two-layer network, in which the first layer consists of $k$ basis functions $\Phi_{\epsilon_{i_j},x_{i_j}}$ with $i_1,\dots,i_k\in\{1,\dots,n\}$ and the second layer takes a weighted sum of the first layer, such that the functions $f_k$ realized by these networks satisfy
\begin{equation*}
E\left[ \Vert f_{k}-f^*\Vert_{H}^{2} \right] \leq \Bigl(o(1) + \frac{C}{\lambda^2}\Bigl(\frac{1}{\lambda}+M^2\Bigr)u(f^*) \Bigr) \frac{1}{k}.
\end{equation*}
Note that the $x_i$ need not lie in a linear space and the $y_i$ may lie in a possibly infinite-dimensional Hilbert space. The error rate is independent of the dimension of $V$ and of the data size $n$.
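The flavor of the result can be illustrated with a small numerical sketch. The Python snippet below is not the paper's construction: it assumes $A=[0,1]$ and $V=\mathbb{R}$, takes $H$ to be the RKHS of a Gaussian kernel (so the minimizer $f^*$ of $u$ is given by kernel ridge regression), and uses Gaussian bumps centered at data points as stand-ins for the basis functions $\Phi_{\epsilon_{i_j},x_{i_j}}$; the coefficient reweighting is likewise an illustrative choice. It builds a two-layer-style approximant $f_k$ from $k$ randomly selected basis functions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x_i in A = [0, 1], y_i in V = R.
n = 200
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Gaussian bump as an illustrative stand-in for the basis functions
# Phi_{eps, x_i} (the paper's Phi is not specified in the abstract).
eps = 0.1
def phi(c, t):
    return np.exp(-((t - c) ** 2) / (2 * eps ** 2))

lam = 1e-2

# f*: minimizer of u(f) = (lam/2)||f||_H^2 + (1/(2n)) sum ||f(x_i)-y_i||^2.
# With H the RKHS of phi, the representer theorem gives
# f*(t) = sum_i a_i phi(x_i, t), where (K + n*lam*I) a = y.
K = phi(x[:, None], x[None, :])
a = np.linalg.solve(K + n * lam * np.eye(n), y)
def f_star(t):
    return phi(x[:, None], np.atleast_1d(t)[None, :]).T @ a

# f_k: two-layer sketch -- the first layer holds k basis functions centered
# at randomly drawn data points i_1,...,i_k, the second layer is a weighted sum.
k = 20
idx = rng.integers(0, n, size=k)   # i_1, ..., i_k drawn uniformly from {1,...,n}
w = (n / k) * a[idx]               # illustrative reweighting of f*'s coefficients
def f_k(t):
    return phi(x[idx, None], np.atleast_1d(t)[None, :]).T @ w

t = np.linspace(0.0, 1.0, 5)
print("f*(t) :", np.round(f_star(t), 3))
print("f_k(t):", np.round(f_k(t), 3))
```

In this toy setting $f_k$ is an unbiased Monte Carlo reconstruction of $f^*$ from $k$ of its $n$ kernel terms, so $E\Vert f_k-f^*\Vert^2$ decays like $1/k$; the paper's theorem concerns its own construction, with a rate independent of the dimension of $V$ and of $n$.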
