Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis R. Bach
MLT · 11 February 2020

Papers citing "Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss"

Showing 50 of 252 citing papers

Boosting Adversarial Robustness From The Perspective of Effective Margin Regularization
Ziquan Liu, Antoni B. Chan
AAML · 11 Oct 2022

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
D. Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli
07 Oct 2022

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Rohin Shah, Vikrant Varma, Ramana Kumar, Mary Phuong, Victoria Krakovna, J. Uesato, Zachary Kenton
04 Oct 2022

Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Arthur Jacot
29 Sep 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
MLT · 29 Sep 2022

Importance Tempering: Group Robustness for Overparameterized Models
Yiping Lu, Wenlong Ji, Zachary Izzo, Lexing Ying
19 Sep 2022

On Generalization of Decentralized Learning with Separable Data
Hossein Taheri, Christos Thrampoulidis
FedML · 15 Sep 2022

Optimal bump functions for shallow ReLU networks: Weight decay, depth separation and the curse of dimensionality
Stephan Wojtowytsch
02 Sep 2022

Incremental Learning in Diagonal Linear Networks
Raphael Berthier
CLL, AI4CE · 31 Aug 2022

On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML, AI4CE · 26 Aug 2022

Feature selection with gradient descent on two-layer networks in low-rotation regimes
Matus Telgarsky
MLT · 04 Aug 2022

Data-driven initialization of deep learning solvers for Hamilton-Jacobi-Bellman PDEs
Anastasia Borovykh, D. Kalise, Alexis Laignelet, P. Parpas
19 Jul 2022

Normalized gradient flow optimization in the training of ReLU artificial neural networks
Simon Eberle, Arnulf Jentzen, Adrian Riekert, G. Weiss
13 Jul 2022

Towards understanding how momentum improves generalization in deep learning
Samy Jelassi, Yuanzhi Li
ODL, MLT, AI4CE · 13 Jul 2022

Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Lechao Xiao, Jeffrey Pennington
11 Jul 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora
08 Jul 2022

Automating the Design and Development of Gradient Descent Trained Expert System Networks
Jeremy Straub
04 Jul 2022

Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, M. Wyart
MLT · 24 Jun 2022

Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien, J. Reygner, Nicolas Flammarion
NoLa · 20 Jun 2022

How You Start Matters for Generalization
Sameera Ramasinghe, L. MacDonald, M. Farazi, Hemanth Saratchandran, Simon Lucey
ODL, AI4CE · 17 Jun 2022

Reconstructing Training Data from Trained Neural Networks
Niv Haim, Gal Vardi, Gilad Yehudai, Ohad Shamir, Michal Irani
15 Jun 2022

The Manifold Hypothesis for Gradient-Based Explanations
Sebastian Bordt, Uddeshya Upadhyay, Zeynep Akata, U. V. Luxburg
FAtt, AAML · 15 Jun 2022

Benefits of Additive Noise in Composing Classes with Bounded Capacity
A. F. Pour, H. Ashtiani
14 Jun 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora
FAtt · 14 Jun 2022

Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko, Nicolas Flammarion
AAML · 13 Jun 2022

Explicit Regularization in Overparametrized Models via Noise Injection
Antonio Orvieto, Anant Raj, Hans Kersting, Francis R. Bach
09 Jun 2022

What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective
Rhea Chowers, Yair Weiss
06 Jun 2022

Understanding Deep Learning via Decision Boundary
Shiye Lei, Fengxiang He, Yancheng Yuan, Dacheng Tao
03 Jun 2022

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion
ODL · 02 Jun 2022

Feature Learning in $L_{2}$-regularized DNNs: Attraction/Repulsion and Sparsity
Arthur Jacot, Eugene Golikov, Clément Hongler, Franck Gabriel
MLT · 31 May 2022

Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Shunta Akiyama, Taiji Suzuki
30 May 2022

The impact of memory on learning sequence-to-sequence tasks
Alireza Seif, S. Loos, Gennaro Tucci, É. Roldán, Sebastian Goldt
29 May 2022

Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures
Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon M. Kleinberg, Aryo Lotfi, M. Raghu, Chiyuan Zhang
MLT · 26 May 2022

On Bridging the Gap between Mean Field and Finite Width in Deep Random Neural Networks with Batch Normalization
Amir Joudaki, Hadi Daneshmand, Francis R. Bach
AI4CE · 25 May 2022

A Case of Exponential Convergence Rates for SVM
Vivien A. Cabannes, Stefano Vigogna
20 May 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran, Gal Vardi, Jason D. Lee
MLT · 18 May 2022

The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen, Yuanzhi Li
SSL · 12 May 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
MLT · 03 May 2022

On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna
MLT · 22 Apr 2022

High-dimensional Asymptotics of Langevin Dynamics in Spiked Matrix Models
Tengyuan Liang, Subhabrata Sen, Pragya Sur
09 Apr 2022

Deep Regression Ensembles
Antoine Didisheim, Bryan Kelly, Semyon Malamud
UQCV · 10 Mar 2022

Fast Rates for Noisy Interpolation Require Rethinking the Effects of Inductive Bias
Konstantin Donhauser, Nicolò Ruggeri, Stefan Stojanovic, Fanny Yang
07 Mar 2022

Why adversarial training can hurt robust accuracy
Jacob Clarysse, Julia Hörrmann, Fanny Yang
AAML · 03 Mar 2022

Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization
I. Zaghloul Amir, Roi Livni, Nathan Srebro
27 Feb 2022

A Note on Machine Learning Approach for Computational Imaging
Bin Dong
24 Feb 2022

Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
14 Feb 2022

Is interpolation benign for random forest regression?
Ludovic Arnould, Claire Boyer, Erwan Scornet
08 Feb 2022

Iterative regularization for low complexity regularizers
C. Molinari, Mathurin Massias, Lorenzo Rosasco, S. Villa
01 Feb 2022

Implicit Regularization Towards Rank Minimization in ReLU Networks
Nadav Timor, Gal Vardi, Ohad Shamir
30 Jan 2022

Limitation of Characterizing Implicit Regularization by Data-independent Functions
Leyang Zhang, Z. Xu, Tao Luo, Yaoyu Zhang
28 Jan 2022