On overcoming the Curse of Dimensionality in Neural Networks

Abstract

Let $A$ be a set and $V$ a Hilbert space. Let $H$ be a Hilbert space of functions $f:A\to V$ such that $\sup_{x\in A}\Vert f(x)\Vert_{V}\leq M \Vert f\Vert_H$. For $i=1,\dots,n$, let $(x_i,y_i)\in A\times V$ comprise our dataset. Let $f^*\in H$ be the unique global minimizer of the functional
\begin{equation*}
u(f) = \frac{\lambda}{2}\Vert f\Vert_{H}^{2} + \frac{1}{2n}\sum_{i=1}^{n}\Vert f(x_i)-y_i\Vert_{V}^{2}.
\end{equation*}
In this paper we show that for each $k\in\mathbb{N}$ there exists a two-layer network, in which the first layer consists of $k$ basis functions $\Phi_{\epsilon_{i_j},x_{i_j}}$ with $i_1,\dots,i_k\in\{1,\dots,n\}$ and the second layer takes a weighted sum of the first layer, such that the functions $f_k$ realized by these networks satisfy
\begin{equation*}
E\left[ \Vert f_{k}-f^*\Vert_{H}^{2} \right] \leq \Bigl(o(1) + \frac{C}{\lambda^2}\Bigl(\frac{1}{\lambda}+M^2\Bigr)u(f^*) \Bigr) \frac{1}{k}.
\end{equation*}
Note that the $x_i$ need not lie in a linear space and the $y_i$ may lie in a possibly infinite-dimensional Hilbert space. The error rate is independent of the dimension of $V$ and of the data size $n$.
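The flavor of the result can be illustrated with a small numerical sketch. The Python snippet below is not the paper's construction: it assumes $A=[0,1]$ and $V=\mathbb{R}$, takes $H$ to be the RKHS of a Gaussian kernel (so the minimizer $f^*$ of $u$ is given by kernel ridge regression), and uses Gaussian bumps centered at data points as stand-ins for the basis functions $\Phi_{\epsilon_{i_j},x_{i_j}}$; the coefficient reweighting is likewise an illustrative choice. It builds a two-layer-style approximant $f_k$ from $k$ randomly selected basis functions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x_i in A = [0, 1], y_i in V = R.
n = 200
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Gaussian bump as an illustrative stand-in for the basis functions
# Phi_{eps, x_i} (the paper's Phi is not specified in the abstract).
eps = 0.1
def phi(c, t):
    return np.exp(-((t - c) ** 2) / (2 * eps ** 2))

lam = 1e-2

# f*: minimizer of u(f) = (lam/2)||f||_H^2 + (1/(2n)) sum ||f(x_i)-y_i||^2.
# With H the RKHS of phi, the representer theorem gives
# f*(t) = sum_i a_i phi(x_i, t), where (K + n*lam*I) a = y.
K = phi(x[:, None], x[None, :])
a = np.linalg.solve(K + n * lam * np.eye(n), y)
def f_star(t):
    return phi(x[:, None], np.atleast_1d(t)[None, :]).T @ a

# f_k: two-layer sketch -- the first layer holds k basis functions centered
# at randomly drawn data points i_1,...,i_k, the second layer is a weighted sum.
k = 20
idx = rng.integers(0, n, size=k)   # i_1, ..., i_k drawn uniformly from {1,...,n}
w = (n / k) * a[idx]               # illustrative reweighting of f*'s coefficients
def f_k(t):
    return phi(x[idx, None], np.atleast_1d(t)[None, :]).T @ w

t = np.linspace(0.0, 1.0, 5)
print("f*(t) :", np.round(f_star(t), 3))
print("f_k(t):", np.round(f_k(t), 3))
```

In this toy setting $f_k$ is an unbiased Monte Carlo reconstruction of $f^*$ from $k$ of its $n$ kernel terms, so $E\Vert f_k-f^*\Vert^2$ decays like $1/k$; the paper's theorem concerns its own construction, with a rate independent of the dimension of $V$ and of $n$.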
