Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators

Carles Domingo-Enrich
Abstract

We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d}[F(x)] - \mathbb{E}_{x \sim \nu_d}[F(x)]|$ is lower bounded as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while it decays exponentially in $d$ when $F$ is any two-layer network with polynomial weights. This shows that deep GAN discriminators are able to distinguish distributions that shallow discriminators cannot. Analogously, we build pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that $|\mathbb{E}_{x \sim \mu_d}[F(x)] - \mathbb{E}_{x \sim \nu_d}[F(x)]|$ is lower bounded as $\Omega(1/(d \log d))$ for some two-layer ReLU network with polynomial weights, while it decays exponentially for bounded-norm functions in the associated RKHS. This confirms that feature learning is beneficial for discriminators. Our bounds are based on Fourier transforms.
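The central quantity of the abstract, $|\mathbb{E}_{x \sim \mu_d}[F(x)] - \mathbb{E}_{x \sim \nu_d}[F(x)]|$, can be estimated from samples by a Monte Carlo average. The following is a minimal sketch of that estimator for a two-layer ReLU discriminator; the network parameters and the Gaussian distributions here are illustrative placeholders, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # ambient dimension (illustrative choice)

# Hypothetical two-layer ReLU discriminator: F(x) = sum_j a_j * relu(<w_j, x> + b_j)
width = 16
W = rng.normal(size=(width, d))
b = rng.normal(size=width)
a = rng.normal(size=width)

def F(X):
    """Apply the two-layer ReLU network row-wise to a batch X of shape (n, d)."""
    return np.maximum(X @ W.T + b, 0.0) @ a

# Placeholder distributions mu, nu (standard normal vs. slightly shifted normal),
# standing in for the paper's constructed pairs mu_d, nu_d.
n = 100_000
mu_samples = rng.normal(size=(n, d))
nu_samples = rng.normal(size=(n, d)) + 0.1

# Monte Carlo estimate of |E_mu[F(x)] - E_nu[F(x)]|
gap = abs(F(mu_samples).mean() - F(nu_samples).mean())
print(gap)
```

The paper's separation results concern how this gap scales with $d$: polynomially for the deeper (or feature-learning) discriminator class, exponentially small for the weaker one.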
