ResearchTrend.AI
arXiv:2010.02501
A Unifying View on Implicit Bias in Training Linear Neural Networks

6 October 2020
Chulhee Yun
Shankar Krishnan
H. Mobahi
    MLT
Abstract

We study the implicit bias of gradient flow (i.e., gradient descent with infinitesimal step size) on linear neural network training. We propose a tensor formulation of neural networks that includes fully-connected, diagonal, and convolutional networks as special cases, and investigate the linear version of the formulation called linear tensor networks. With this formulation, we can characterize the convergence direction of the network parameters as singular vectors of a tensor defined by the network. For L-layer linear tensor networks that are orthogonally decomposable, we show that gradient flow on separable classification finds a stationary point of the ℓ_{2/L} max-margin problem in a "transformed" input space defined by the network. For underdetermined regression, we prove that gradient flow finds a global minimum which minimizes a norm-like function that interpolates between weighted ℓ_1 and ℓ_2 norms in the transformed input space. Our theorems subsume existing results in the literature while removing standard convergence assumptions. We also provide experiments that corroborate our analysis.
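The contrast the abstract draws between ℓ_1-like and ℓ_2 implicit biases can be seen in a small numerical sketch. The example below is our own toy illustration, not the paper's linear tensor network construction (all constants and the problem setup are made up): gradient descent with a small step size stands in for gradient flow, and a depth-2 diagonal linear network is compared against a plain linear model on the same underdetermined regression problem.

```python
import numpy as np

# Toy sketch (not the paper's construction): solve an underdetermined
# regression problem (n < d) two ways and compare which interpolant
# gradient descent selects.
#   1. plain linear model w, zero init        -> min-l2-norm interpolant
#   2. diagonal linear net  w = u*u - v*v,
#      small init                             -> l1-like (sparse) interpolant
# Both fit the data exactly; only the parameterization differs.

rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:2] = [3.0, -2.0]          # sparse ground truth
y = X @ w_true

lr, steps = 2e-3, 400_000

# 1. Direct parameterization: iterates stay in the row space of X,
#    so gradient descent converges to the min-l2-norm solution of X w = y.
w = np.zeros(d)
for _ in range(steps):
    w -= lr * X.T @ (X @ w - y) / n
w_l2 = w

# 2. Depth-2 diagonal linear network w = u*u - v*v. A small identical
#    init keeps w near 0 until the dynamics activate a few coordinates,
#    biasing the final interpolant toward small l1 norm.
u = np.full(d, 1e-3)
v = np.full(d, 1e-3)
for _ in range(steps):
    g = X.T @ (X @ (u * u - v * v) - y) / n   # gradient w.r.t. w
    u, v = u - lr * 2 * g * u, v + lr * 2 * g * v
w_l1 = u * u - v * v

print("l1 norm, direct params :", round(np.abs(w_l2).sum(), 3))
print("l1 norm, diagonal net  :", round(np.abs(w_l1).sum(), 3))
```

Both runs drive the training loss to (near) zero, but the diagonal-network solution typically has a noticeably smaller ℓ_1 norm, illustrating how the architecture alone changes the implicit bias.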

View on arXiv