Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek, Amire Bendjeddou, Daniel Hsu
13 November 2024 · arXiv: 2411.08798

Papers citing "Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence"

29 papers
Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations
Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu
17 Jun 2024 · MLT

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee, Kazusato Oko, Taiji Suzuki, Denny Wu
03 Jun 2024 · MLT

Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan
24 May 2024

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala
05 Feb 2024 · MLT

On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions
Simon Martin, Francis Bach, Giulio Biroli
07 Nov 2023

Should Under-parameterized Student Networks Copy or Average Teacher Weights?
Berfin Simsek, Amire Bendjeddou, W. Gerstner, Johanni Brea
03 Nov 2023

On Learning Gaussian Multi-index Models with Gradient Flow
A. Bietti, Joan Bruna, Loucas Pillaud-Vivien
30 Oct 2023

SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
26 Sep 2023 · MLT

On Single Index Models beyond Gaussian Data
Joan Bruna, Loucas Pillaud-Vivien, Aaron Zweig
28 Jul 2023

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
Alexandru Damian, Eshaan Nichani, Rong Ge, Jason D. Lee
18 May 2023 · MLT

Expand-and-Cluster: Parameter Recovery of Neural Networks
Flavio Martinelli, Berfin Simsek, W. Gerstner, Johanni Brea
25 Apr 2023

Learning time-scales in two-layers neural networks
Raphael Berthier, Andrea Montanari, Kangjie Zhou
28 Feb 2023

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz
21 Feb 2023 · FedML · MLT

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du
20 Feb 2023

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song
27 Oct 2022

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
18 Jul 2022

Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi
30 Jun 2022 · SSL · MLT

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion
02 Jun 2022 · ODL

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
03 May 2022 · MLT

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani, M. Field
21 Jul 2021

Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances
Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clément Hongler, W. Gerstner, Johanni Brea
25 May 2021 · AI4CE

Learning Polynomials of Few Relevant Dimensions
Sitan Chen, Raghu Meka
28 Apr 2020

Online stochastic gradient descent on non-convex losses from high-dimensional inference
Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath
23 Mar 2020

Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová
18 Jun 2019 · MLT

Gradient Descent Quantizes ReLU Network Features
Hartmut Maennel, Olivier Bousquet, Sylvain Gelly
22 Mar 2018 · MLT

On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition
Marco Mondelli, Andrea Montanari
20 Feb 2018 · MLT · CML

Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir
24 Dec 2017

Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely, Roy Frostig, Y. Singer
18 Feb 2016

Tensor decompositions for learning latent variable models
Anima Anandkumar, Rong Ge, Daniel J. Hsu, Sham Kakade, Matus Telgarsky
29 Oct 2012