SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz
arXiv:2302.11055, 21 February 2023
Papers citing "SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics" (27 of 27 papers shown)

When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li, Yihua Zhang, Shuai Zhang, Ming Wang, Sijia Liu, Pin-Yu Chen. 15 Apr 2025.

Statistically guided deep learning
Michael Kohler, A. Krzyżak. 11 Apr 2025.

Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent
Guillaume Braun, Minh Ha Quang, Masaaki Imaizumi. 31 Mar 2025.

A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende, Federica Gerace, Alessandro Laio, Sebastian Goldt. 17 Feb 2025.

Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
Elisabetta Cornacchia, Dan Mikulincer, Elchanan Mossel. 10 Feb 2025.

Physics of Skill Learning
Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark. 21 Jan 2025.

Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Ziang Chen, Rong Ge. 10 Jan 2025.

Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek, Amire Bendjeddou, Daniel Hsu. 13 Nov 2024.

Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu. 21 Oct 2024.

From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Kaiyue Wen, Huaqing Zhang, Hongzhou Lin, Jingzhao Zhang. 07 Oct 2024.

How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan. 26 Sep 2024.

Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Alireza Mousavi-Hosseini, Denny Wu, Murat A. Erdogdu. 14 Aug 2024.

How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
Pierfrancesco Beneventano, Andrea Pinto, Tomaso A. Poggio. 17 Jun 2024.

Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan. 24 May 2024.

Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu, Da Kuang, Surbhi Goel. 05 Mar 2024.

Gradient-Based Feature Learning under Structured Data
Alireza Mousavi-Hosseini, Denny Wu, Taiji Suzuki, Murat A. Erdogdu. 07 Sep 2023.

Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang. 07 Sep 2023.

On Single Index Models beyond Gaussian Data
Joan Bruna, Loucas Pillaud-Vivien, Aaron Zweig. 28 Jul 2023.

Tight conditions for when the NTK approximation is valid
Enric Boix-Adserà, Etai Littwin. 22 May 2023.

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
Alexandru Damian, Eshaan Nichani, Rong Ge, Jason D. Lee. 18 May 2023.

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani, Alexandru Damian, Jason D. Lee. 11 May 2023.

Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme, Nicolas Flammarion. 02 Apr 2023.

Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent
Liu Ziyin, Botao Li, Tomer Galanti, Masakuni Ueda. 23 Mar 2023.

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song. 27 Oct 2022.

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu. 29 Sep 2022.

Optimization-Based Separations for Neural Networks
Itay Safran, Jason D. Lee. 04 Dec 2021.

On the Power of Differentiable Learning versus PAC and SQ Learning
Emmanuel Abbe, Pritish Kamath, Eran Malach, Colin Sandon, Nathan Srebro. 09 Aug 2021.