
Learning ReLU Networks via Alternating Minimization

20 June 2018
Gauri Jagatap
C. Hegde
arXiv:1806.07863
Abstract

We propose and analyze a new family of algorithms for training neural networks with ReLU activations. Our algorithms are based on the technique of alternating minimization: estimating the activation patterns of each ReLU for all given samples, interleaved with weight updates via a least-squares step. The main focus of our paper is 1-hidden-layer networks with $k$ hidden neurons and ReLU activation. We show that under standard distributional assumptions on the $d$-dimensional input data, our algorithm provably recovers the true "ground truth" parameters in a linearly convergent fashion. This holds as long as the weights are sufficiently well initialized; furthermore, our method requires only $n=\widetilde{O}(dk^2)$ samples. We also analyze the special case of 1-hidden-layer networks with skip connections, commonly used in ResNet-type architectures, and propose a novel initialization strategy for the same. For ReLU-based ResNet-type networks, we provide the first linear convergence guarantee with an end-to-end algorithm. We also extend this framework to deeper networks and empirically demonstrate its convergence to a global minimum.
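
The alternating-minimization idea described above can be summarized in a short sketch: fix the current weights, record which ReLUs are active on each sample, then re-fit all weights jointly by least squares with those activation patterns held fixed. The code below is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes a 1-hidden-layer model $y \approx \sum_j \mathrm{ReLU}(w_j^\top x)$ with unit output weights, uses a plain random initialization instead of the paper's dedicated initialization scheme, and omits the sample-complexity and convergence analysis. The names `alt_min_relu`, `W`, and `P` are illustrative, not from the paper.

```python
import numpy as np


def alt_min_relu(X, y, k, n_iters=50, seed=0):
    """Alternating minimization for a 1-hidden-layer ReLU model
    y ~ sum_j relu(x^T w_j), with unit output weights (an illustrative
    simplification; names and defaults here are not from the paper)."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Plain random initialization; the paper's linear-convergence guarantee
    # instead relies on a sufficiently good initialization.
    W = rng.standard_normal((k, d)) / np.sqrt(d)

    for _ in range(n_iters):
        # Step 1: estimate the activation pattern of every ReLU on every sample.
        P = (X @ W.T > 0).astype(float)                     # (n, k) 0/1 mask

        # Step 2: with the patterns frozen, the prediction is linear in the
        # weights, so update all of them jointly with one least-squares solve.
        # Row i of A concatenates P[i, j] * X[i] over the k neurons.
        A = (P[:, :, None] * X[:, None, :]).reshape(n, k * d)
        w_vec, *_ = np.linalg.lstsq(A, y, rcond=None)
        W = w_vec.reshape(k, d)
    return W


# Tiny synthetic check: Gaussian inputs, planted weights, noiseless targets.
rng = np.random.default_rng(1)
d, k, n = 10, 3, 2000
W_true = rng.standard_normal((k, d))
X = rng.standard_normal((n, d))
y = np.maximum(X @ W_true.T, 0.0).sum(axis=1)

W_hat = alt_min_relu(X, y, k)
print("fit residual:", np.linalg.norm(np.maximum(X @ W_hat.T, 0.0).sum(axis=1) - y))
```

If the iteration reaches the planted weights (up to a permutation of neurons), the printed residual is near zero; with an unlucky random start it can stall, which is exactly why the paper conditions its guarantee on a sufficiently good initialization.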
