The Landscape of Deep Learning Algorithms

19 May 2017
Pan Zhou, Jiashi Feng
arXiv:1705.07038
Abstract

This paper studies the landscape of the empirical risk of deep neural networks by theoretically analyzing its convergence to the population risk as well as its stationary points and their properties. For an $l$-layer linear neural network, we prove that its empirical risk uniformly converges to its population risk at the rate $\mathcal{O}(r^{2l}\sqrt{d\log(l)}/\sqrt{n})$, where $n$ is the training sample size, $d$ is the total weight dimension, and $r$ bounds the magnitude of the weights in each layer. Based on this result, we derive stability and generalization bounds for the empirical risk. We also establish the uniform convergence of the gradient of the empirical risk to its population counterpart. We prove a one-to-one correspondence, with convergence guarantees, between the non-degenerate stationary points of the empirical and population risks, which characterizes the landscape of deep neural networks. We then extend the analysis to deep nonlinear neural networks with sigmoid activation functions, proving similar convergence results for their empirical risks and gradients and analyzing the properties of their non-degenerate stationary points. To the best of our knowledge, this work is the first to theoretically characterize the landscape of deep learning algorithms. Our results also yield the sample complexity of training a good deep neural network, and provide a theoretical understanding of how the network depth $l$, the layer width, the network size $d$, and the parameter magnitude $r$ determine the neural network landscape.
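As an illustrative reading of the stated convergence rate (a back-of-the-envelope sketch obtained by inverting the bound, not a calculation quoted from the paper), requiring the uniform-convergence term to fall below a target excess risk $\epsilon$ gives a sample-complexity estimate:

\[
r^{2l}\sqrt{\frac{d\log(l)}{n}} \le \epsilon
\quad\Longrightarrow\quad
n \;\ge\; \frac{r^{4l}\, d\log(l)}{\epsilon^{2}},
\]

so the required number of samples grows linearly with the total weight dimension $d$ and, whenever $r > 1$, exponentially with the depth $l$ through the factor $r^{4l}$.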
