Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat, Francis R. Bach
arXiv:2002.04486, 11 February 2020
Papers citing "Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss" (50 of 252 papers shown)
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias. Jiyoung Park, Ian Pelakh, Stephan Wojtowytsch. 10 Nov 2023.
A Quadratic Synchronization Rule for Distributed Deep Learning. Xinran Gu, Kaifeng Lyu, Sanjeev Arora, Jingzhao Zhang, Longbo Huang. 22 Oct 2023.
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit. Blake Bordelon, Lorenzo Noci, Mufan Li, Boris Hanin, Cengiz Pehlevan. 28 Sep 2023.
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem. Margalit Glasgow. 26 Sep 2023.
Globally Convergent Accelerated Algorithms for Multilinear Sparse Logistic Regression with ℓ₀-constraints. Weifeng Yang, Wenwen Min. 17 Sep 2023.
Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics. Yehonatan Avidan, Qianyi Li, H. Sompolinsky. 08 Sep 2023.
Six Lectures on Linearized Neural Networks. Theodor Misiakiewicz, Andrea Montanari. 25 Aug 2023.
An Exact Kernel Equivalence for Finite Classification Models. Brian Bell, Michaela Geyer, David Glickenstein, Amanda Fernandez, Juston Moore. 01 Aug 2023.
Noisy Interpolation Learning with Shallow Univariate ReLU Networks. Nirmit Joshi, Gal Vardi, Nathan Srebro. 28 Jul 2023.
Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory. Minhak Song, Chulhee Yun. 09 Jul 2023.
Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows. Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré. 30 Jun 2023.
Max-Margin Token Selection in Attention Mechanism. Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak. 23 Jun 2023.
Scaling MLPs: A Tale of Inductive Bias. Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann. 23 Jun 2023.
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks. Yuan Cao, Difan Zou, Yuan-Fang Li, Quanquan Gu. 20 Jun 2023.
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions. Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe. 17 Jun 2023.
Exact Count of Boundary Pieces of ReLU Classifiers: Towards the Proper Complexity Measure for Classification. Paweł Piwek, Adam Klukowski, Tianyang Hu. 15 Jun 2023.
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models. Ritwik Sinha, Zhao-quan Song, Dinesh Manocha. 04 Jun 2023.
Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff. Arthur Jacot. 30 May 2023.
A Rainbow in Deep Network Black Boxes. Florentin Guth, Brice Ménard, G. Rochette, S. Mallat. 29 May 2023.
On the Role of Noise in the Sample Complexity of Learning Recurrent Neural Networks: Exponential Gaps for Long Sequences. A. F. Pour, H. Ashtiani. 28 May 2023.
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data. Hossein Taheri, Christos Thrampoulidis. 22 May 2023.
The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold. Jialin Mao, Itay Griniasty, H. Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari. 02 May 2023.
Saddle-to-Saddle Dynamics in Diagonal Linear Networks. Scott Pesme, Nicolas Flammarion. 02 Apr 2023.
On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks. Sebastian Neumayer, Lénaïc Chizat, M. Unser. 31 Mar 2023.
Solving Regularized Exp, Cosh and Sinh Regression Problems. Zhihang Li, Zhao-quan Song, Dinesh Manocha. 28 Mar 2023.
Global Optimality of Elman-type RNN in the Mean-Field Regime. Andrea Agazzi, Jian-Xiong Lu, Sayan Mukherjee. 12 Mar 2023.
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro. 02 Mar 2023.
The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro. 02 Mar 2023.
Penalising the biases in norm regularisation enforces sparsity. Etienne Boursier, Nicolas Flammarion. 02 Mar 2023.
On the existence of minimizers in shallow residual ReLU neural network optimization landscapes. Steffen Dereich, Arnulf Jentzen, Sebastian Kassing. 28 Feb 2023.
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics. Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz. 21 Feb 2023.
Generalization and Stability of Interpolating Neural Networks with Minimal Width. Hossein Taheri, Christos Thrampoulidis. 18 Feb 2023.
Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent. Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi. 14 Feb 2023.
Efficient displacement convex optimization with particle gradient descent. Hadi Daneshmand, J. Lee, Chi Jin. 09 Feb 2023.
Simplicity Bias in 1-Hidden Layer Neural Networks. Depen Morwani, Jatin Batra, Prateek Jain, Praneeth Netrapalli. 01 Feb 2023.
Naive imputation implicitly regularizes high-dimensional linear models. Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet. 31 Jan 2023.
Generalization on the Unseen, Logic Reasoning and Degree Curriculum. Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk. 30 Jan 2023.
Deep Learning Meets Sparse Regularization: A Signal Processing Perspective. Rahul Parhi, Robert D. Nowak. 23 Jan 2023.
Understanding Difficulty-based Sample Weighting with a Universal Difficulty Measure. Xiaoling Zhou, Ou Wu, Weiyao Zhu, Ziyang Liang. 12 Jan 2023.
Iterative regularization in classification via hinge loss diagonal descent. Vassilis Apidopoulos, T. Poggio, Lorenzo Rosasco, S. Villa. 24 Dec 2022.
Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs. Chenxiao Yang, Qitian Wu, Jiahua Wang, Junchi Yan. 18 Dec 2022.
On the symmetries in the dynamics of wide two-layer neural networks. Karl Hajjar, Lénaïc Chizat. 16 Nov 2022.
Regression as Classification: Influence of Task Formulation on Neural Network Features. Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert. 10 Nov 2022.
Duality for Neural Networks through Reproducing Kernel Banach Spaces. L. Spek, T. J. Heeringa, Felix L. Schwenninger, C. Brune. 09 Nov 2022.
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. 28 Oct 2022.
Learning Single-Index Models with Shallow Neural Networks. A. Bietti, Joan Bruna, Clayton Sanford, M. Song. 27 Oct 2022.
Vision Transformers provably learn spatial structure. Samy Jelassi, Michael E. Sander, Yuan-Fang Li. 13 Oct 2022.
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro, Wei Hu. 13 Oct 2022.
Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. Diyuan Wu, Vyacheslav Kungurtsev, Marco Mondelli. 13 Oct 2022.
SGD with Large Step Sizes Learns Sparse Features. Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion. 11 Oct 2022.