Learning One-hidden-layer Neural Networks with Landscape Design

1 November 2017
Rong Ge
Jason D. Lee
Tengyu Ma
arXiv: 1711.00501
Abstract

We consider the problem of learning a one-hidden-layer neural network: we assume the input $x \in \mathbb{R}^d$ is drawn from a Gaussian distribution and the label is $y = a^\top \sigma(Bx) + \xi$, where $a$ is a nonnegative vector in $\mathbb{R}^m$ with $m \le d$, $B \in \mathbb{R}^{m \times d}$ is a full-rank weight matrix, and $\xi$ is a noise vector. We first give an analytic formula for the population risk of the standard squared loss and demonstrate that it implicitly attempts to decompose a sequence of low-rank tensors simultaneously. Inspired by the formula, we design a non-convex objective function $G(\cdot)$ whose landscape is guaranteed to have the following properties: 1. All local minima of $G$ are also global minima. 2. All global minima of $G$ correspond to the ground-truth parameters. 3. The value and gradient of $G$ can be estimated using samples. With these properties, stochastic gradient descent on $G$ provably converges to the global minimum and learns the ground-truth parameters. We also prove a finite-sample complexity result and validate the results by simulations.
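The sketch below is only meant to make the setup concrete; it is not the paper's method. It samples data from the generative model $y = a^\top \sigma(Bx) + \xi$ with Gaussian inputs and runs plain SGD on the standard squared loss that the paper analyzes (not on the designed objective $G$). The dimensions, step size, noise level, and the choice of ReLU for $\sigma$ are all illustrative assumptions.

```python
# Minimal sketch (assumptions: ReLU activation, d=10, m=4, lr=0.01).
# Illustrates the generative model y = a^T sigma(Bx) + xi and plain SGD
# on the squared loss; the paper's landscape-designed objective G is
# not reproduced here.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 10, 4, 5000                  # input dim, hidden width (m <= d), samples

# Ground-truth parameters: nonnegative a, full-rank B, small Gaussian noise xi.
a_true = np.abs(rng.normal(size=m))
B_true = rng.normal(size=(m, d)) / np.sqrt(d)

X = rng.normal(size=(n, d))            # Gaussian inputs x ~ N(0, I_d)
y = np.maximum(X @ B_true.T, 0.0) @ a_true + 0.01 * rng.normal(size=n)

# SGD on the squared loss L(a, B) = (1/2) E[(a^T relu(Bx) - y)^2].
a = np.abs(rng.normal(size=m))
B = rng.normal(size=(m, d)) / np.sqrt(d)
lr = 0.01
for t in range(20000):
    i = rng.integers(n)
    h = np.maximum(B @ X[i], 0.0)      # hidden activations sigma(Bx)
    r = a @ h - y[i]                   # residual
    g_a = r * h                        # gradient w.r.t. a
    g_B = r * np.outer(a * (h > 0.0), X[i])  # ReLU subgradient w.r.t. B
    a -= lr * g_a
    B -= lr * g_B

print("final squared loss:",
      0.5 * np.mean((np.maximum(X @ B.T, 0.0) @ a - y) ** 2))
```

As the paper's analysis of the population risk suggests, this vanilla objective can have bad local minima in general; the designed objective $G$ is what makes SGD provably recover $a$ and $B$.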