Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials

24 July 2023
Ilias Diakonikolas
Daniel M. Kane
arXiv:2307.12840
Abstract

We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy. Prior work had given an algorithm for this problem with complexity $(dk/\epsilon)^{h(k)}$, where the function $h(k)$ scales super-polynomially in $k$. Interestingly, the complexity of our algorithm is near-optimal within the class of Correlational Statistical Query algorithms. At a high level, our algorithm uses tensor decomposition to identify a subspace such that all the $O(k)$-order moments are small in the orthogonal directions. Its analysis makes essential use of the theory of Schur polynomials to show that the higher-moment error tensors are small given that the lower-order ones are.
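
The subspace-identification idea can be loosely illustrated with a degree-2 moment computation, which is far simpler than the $O(k)$-order tensors the paper actually requires. The NumPy sketch below is an illustrative toy, not the paper's algorithm: it assumes positive combination weights $a_i$ so that the second moment does not cancel, and recovers the span of the hidden weight vectors from the top eigenvectors of the empirical matrix $\mathbb{E}[y\,(xx^\top - I)]$. All variable names and constants are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 20, 3, 200_000

# Ground-truth network f(x) = sum_i a_i * ReLU(<w_i, x>), with unit-norm w_i.
W = rng.normal(size=(k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)
# Positive a_i avoid the cancellations that force the paper to use higher moments.
a = rng.uniform(0.5, 1.5, size=k)

X = rng.normal(size=(n, d))          # standard Gaussian inputs on R^d
y = np.maximum(X @ W.T, 0.0) @ a     # labels from the ReLU network

# Empirical degree-2 moment matrix  M ~ E[y (x x^T - I)].
M = (X.T * y) @ X / n - y.mean() * np.eye(d)

# For positive a_i, M ~ c * sum_i a_i w_i w_i^T with a constant c > 0,
# so its top-k eigenvectors approximately span span(w_1, ..., w_k).
_, eigvecs = np.linalg.eigh(M)
U = eigvecs[:, -k:]                  # orthonormal basis of the candidate subspace

# Sanity check: true weight vectors should have tiny components outside span(U).
residuals = np.linalg.norm(W.T - U @ (U.T @ W.T), axis=0)
print("out-of-subspace residuals:", residuals)
```

The reason the paper needs higher-order moments is precisely that the degree-2 matrix above can vanish when the $a_i$ have mixed signs; the sketch should be read only as an illustration of the moment-to-subspace idea, with the Schur-polynomial machinery controlling the higher-order error tensors in the general case.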
