Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

11 February 2020
Lénaïc Chizat
Francis R. Bach
    MLT
arXiv: 2002.04486

Papers citing "Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss"

50 / 252 papers shown
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Noam Razin
Asaf Maman
Nadav Cohen
46
29
0
27 Jan 2022
How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning -- an Exact Macroscopic Characterization
Jakob Heiss
Josef Teichmann
Hanna Wutte
MLT
10
2
0
31 Dec 2021
Integral representations of shallow neural network with Rectified Power Unit activation function
Ahmed Abdeljawad
Philipp Grohs
12
10
0
20 Dec 2021
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki
Amartya Mitra
Yoshua Bengio
Guillaume Lajoie
61
25
0
06 Dec 2021
On the Equivalence between Neural Network and Support Vector Machine
Yilan Chen
Wei Huang
Lam M. Nguyen
Tsui-Wei Weng
AAML
25
18
0
11 Nov 2021
Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
A. Shevchenko
Vyacheslav Kungurtsev
Marco Mondelli
MLT
41
13
0
03 Nov 2021
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
Kaifeng Lyu
Zhiyuan Li
Runzhe Wang
Sanjeev Arora
MLT
34
69
0
26 Oct 2021
Rethinking Neural vs. Matrix-Factorization Collaborative Filtering: the Theoretical Perspectives
Zida Cheng
Chuanwei Ruan
Siheng Chen
Sushant Kumar
Ya Zhang
24
16
0
23 Oct 2021
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
60
11
0
21 Oct 2021
Gradient Descent on Infinitely Wide Neural Networks: Global Convergence and Generalization
Francis R. Bach
Lénaïc Chizat
MLT
23
23
0
15 Oct 2021
On the Double Descent of Random Features Models Trained with SGD
Fanghui Liu
Johan A. K. Suykens
V. Cevher
MLT
19
10
0
13 Oct 2021
AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion
Zhemin Li
Tao Sun
Hongxia Wang
Bao Wang
50
6
0
12 Oct 2021
An Unconstrained Layer-Peeled Perspective on Neural Collapse
Wenlong Ji
Yiping Lu
Yiliang Zhang
Zhun Deng
Weijie J. Su
135
83
0
06 Oct 2021
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
50
28
0
06 Oct 2021
VC dimension of partially quantized neural networks in the overparametrized regime
Yutong Wang
Clayton D. Scott
22
1
0
06 Oct 2021
Understanding neural networks with reproducing kernel Banach spaces
Francesca Bartolucci
E. De Vito
Lorenzo Rosasco
Stefano Vigogna
47
50
0
20 Sep 2021
Interpolation can hurt robust generalization even when there is no noise
Konstantin Donhauser
Alexandru Țifrea
Michael Aerni
Reinhard Heckel
Fanny Yang
34
14
0
05 Aug 2021
Determining Sentencing Recommendations and Patentability Using a Machine Learning Trained Expert System
Logan Brown
Reid Pezewski
Jeremy Straub
AILaw
23
2
0
05 Aug 2021
Fake News and Phishing Detection Using a Machine Learning Trained Expert System
Benjamin Fitzpatrick
X. Liang
Jeremy Straub
32
6
0
04 Aug 2021
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
Liam Hodgkinson
Umut Simsekli
Rajiv Khanna
Michael W. Mahoney
22
20
0
02 Aug 2021
Continuous vs. Discrete Optimization of Deep Neural Networks
Omer Elkabetz
Nadav Cohen
68
44
0
14 Jul 2021
Generalization by design: Shortcuts to Generalization in Deep Learning
P. Táborský
Lars Kai Hansen
OOD
AI4CE
15
0
0
05 Jul 2021
A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf
Alon Brutzkus
Amir Globerson
34
17
0
04 Jul 2021
Fast Margin Maximization via Dual Acceleration
Ziwei Ji
Nathan Srebro
Matus Telgarsky
15
35
0
01 Jul 2021
Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
Arthur Jacot
François Ged
Berfin Şimşek
Clément Hongler
Franck Gabriel
29
52
0
30 Jun 2021
Can contrastive learning avoid shortcut solutions?
Joshua Robinson
Li Sun
Ke Yu
Kayhan Batmanghelich
Stefanie Jegelka
S. Sra
SSL
19
142
0
21 Jun 2021
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
Scott Pesme
Loucas Pillaud-Vivien
Nicolas Flammarion
27
99
0
17 Jun 2021
Understanding Deflation Process in Over-parametrized Tensor Decomposition
Rong Ge
Y. Ren
Xiang Wang
Mo Zhou
8
17
0
11 Jun 2021
On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
Shunta Akiyama
Taiji Suzuki
MLT
19
13
0
11 Jun 2021
Early-stopped neural networks are consistent
Ziwei Ji
Justin D. Li
Matus Telgarsky
14
36
0
10 Jun 2021
Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics
Carles Domingo-Enrich
Youssef Mroueh
27
1
0
10 Jun 2021
FEAR: A Simple Lightweight Method to Rank Architectures
Debadeepta Dey
Shital C. Shah
Sébastien Bubeck
OOD
30
4
0
07 Jun 2021
Redundant representations help generalization in wide neural networks
Diego Doimo
Aldo Glielmo
Sebastian Goldt
A. Laio
AI4CE
30
9
0
07 Jun 2021
Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis
Stephan Wojtowytsch
36
33
0
04 Jun 2021
Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path
X. Y. Han
Vardan Papyan
D. Donoho
AAML
30
136
0
03 Jun 2021
The Dynamics of Gradient Descent for Overparametrized Neural Networks
Siddhartha Satpathi
R. Srikant
MLT
AI4CE
16
13
0
13 May 2021
Directional Convergence Analysis under Spherically Symmetric Distribution
Dachao Lin
Zhihua Zhang
MLT
12
0
0
09 May 2021
Relative stability toward diffeomorphisms indicates performance in deep nets
Leonardo Petrini
Alessandro Favero
Mario Geiger
M. Wyart
OOD
38
15
0
06 May 2021
Two-layer neural networks with values in a Banach space
Yury Korolev
29
23
0
05 May 2021
RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg
Sivaraman Balakrishnan
J. Zico Kolter
Zachary Chase Lipton
30
30
0
01 May 2021
On Energy-Based Models with Overparametrized Shallow Neural Networks
Carles Domingo-Enrich
A. Bietti
Eric Vanden-Eijnden
Joan Bruna
BDL
33
9
0
15 Apr 2021
Understanding the role of importance weighting for deep learning
Da Xu
Yuting Ye
Chuanwei Ruan
FAtt
39
43
0
28 Mar 2021
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
24
10
0
19 Mar 2021
Expert System Gradient Descent Style Training: Development of a Defensible Artificial Intelligence Technique
Jeremy Straub
11
27
0
07 Mar 2021
Unintended Effects on Adaptive Learning Rate for Training Neural Network with Output Scale Change
Ryuichi Kanoh
M. Sugiyama
8
0
0
05 Mar 2021
Label-Imbalanced and Group-Sensitive Classification under Overparameterization
Ganesh Ramachandra Kini
Orestis Paraskevas
Samet Oymak
Christos Thrampoulidis
27
93
0
02 Mar 2021
Experiments with Rich Regime Training for Deep Learning
Xinyan Li
A. Banerjee
32
2
0
26 Feb 2021
Do Input Gradients Highlight Discriminative Features?
Harshay Shah
Prateek Jain
Praneeth Netrapalli
AAML
FAtt
21
57
0
25 Feb 2021
Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
Maria Refinetti
Sebastian Goldt
Florent Krzakala
Lenka Zdeborová
22
72
0
23 Feb 2021
Approximation and Learning with Deep Convolutional Models: a Kernel Perspective
A. Bietti
34
29
0
19 Feb 2021