Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1710.10345
Cited By
The Implicit Bias of Gradient Descent on Separable Data
27 October 2017
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The Implicit Bias of Gradient Descent on Separable Data"
50 / 244 papers shown
Title
Malign Overfitting: Interpolation Can Provably Preclude Invariance
Yoav Wald
G. Yona
Uri Shalit
Y. Carmon
17
6
0
28 Nov 2022
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J. Bigelow
Robert P. Dick
David M. Krueger
Hidenori Tanaka
34
45
0
15 Nov 2022
Regression as Classification: Influence of Task Formulation on Neural Network Features
Lawrence Stewart
Francis R. Bach
Quentin Berthet
Jean-Philippe Vert
35
24
0
10 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare?
Julius Martinetz
T. Martinetz
30
1
0
07 Nov 2022
Instance-Dependent Generalization Bounds via Optimal Transport
Songyan Hou
Parnian Kassraie
Anastasis Kratsios
Andreas Krause
Jonas Rothfuss
22
6
0
02 Nov 2022
Grokking phase transitions in learning local rules with gradient descent
Bojan Žunkovič
E. Ilievski
63
16
0
26 Oct 2022
Interpolating Discriminant Functions in High-Dimensional Gaussian Latent Mixtures
Xin Bing
M. Wegkamp
21
1
0
25 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
40
49
0
25 Oct 2022
Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors
Qixun Wang
Yifei Wang
Hong Zhu
Yisen Wang
OOD
22
19
0
13 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
29
4
0
13 Oct 2022
SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko
Aditya Varre
Loucas Pillaud-Vivien
Nicolas Flammarion
45
56
0
11 Oct 2022
Class-wise and reduced calibration methods
Michael Panchenko
Anes Benmerzoug
Miguel de Benito Delgado
21
0
0
07 Oct 2022
Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Rohin Shah
Vikrant Varma
Ramana Kumar
Mary Phuong
Victoria Krakovna
J. Uesato
Zachary Kenton
40
68
0
04 Oct 2022
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima
Peter L. Bartlett
Philip M. Long
Olivier Bousquet
76
34
0
04 Oct 2022
Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks
Sravanti Addepalli
Anshul Nasery
R. Venkatesh Babu
Praneeth Netrapalli
Prateek Jain
AAML
38
3
0
04 Oct 2022
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Arthur Jacot
41
25
0
29 Sep 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini
Sejun Park
M. Girotti
Ioannis Mitliagkas
Murat A. Erdogdu
MLT
324
48
0
29 Sep 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
96
6
0
27 Sep 2022
Approximate Description Length, Covering Numbers, and VC Dimension
Amit Daniely
Gal Katzhendler
16
0
0
26 Sep 2022
A Validation Approach to Over-parameterized Matrix and Image Recovery
Lijun Ding
Zhen Qin
Liwei Jiang
Jinxin Zhou
Zhihui Zhu
48
13
0
21 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Niladri S. Chatterji
Philip M. Long
23
8
0
19 Sep 2022
Importance Tempering: Group Robustness for Overparameterized Models
Yiping Lu
Wenlong Ji
Zachary Izzo
Lexing Ying
47
7
0
19 Sep 2022
On Generalization of Decentralized Learning with Separable Data
Hossein Taheri
Christos Thrampoulidis
FedML
42
11
0
15 Sep 2022
Solving Elliptic Problems with Singular Sources using Singularity Splitting Deep Ritz Method
Tianhao Hu
Bangti Jin
Zhi Zhou
31
6
0
07 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
34
72
0
26 Aug 2022
Intersection of Parallels as an Early Stopping Criterion
Ali Vardasbi
Maarten de Rijke
Mostafa Dehghani
MoMe
41
5
0
19 Aug 2022
On the generalization of learning algorithms that do not converge
N. Chandramoorthy
Andreas Loukas
Khashayar Gatmiry
Stefanie Jegelka
MLT
19
11
0
16 Aug 2022
Imbalance Trouble: Revisiting Neural-Collapse Geometry
Christos Thrampoulidis
Ganesh Ramachandra Kini
V. Vakilian
Tina Behnia
35
69
0
10 Aug 2022
On the Activation Function Dependence of the Spectral Bias of Neural Networks
Q. Hong
Jonathan W. Siegel
Qinyan Tan
Jinchao Xu
34
23
0
09 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
45
27
0
08 Jul 2022
How many labelers do you have? A closer look at gold-standard labels
Chen Cheng
Hilal Asi
John C. Duchi
21
6
0
24 Jun 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien
J. Reygner
Nicolas Flammarion
NoLa
33
31
0
20 Jun 2022
Reconstructing Training Data from Trained Neural Networks
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
40
132
0
15 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
45
71
0
14 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen
Trang H. Tran
32
2
0
13 Jun 2022
Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli
30
74
0
08 Jun 2022
Adversarial Reprogramming Revisited
Matthias Englert
R. Lazic
AAML
29
9
0
07 Jun 2022
Building Robust Ensembles via Margin Boosting
Dinghuai Zhang
Hongyang R. Zhang
Aaron Courville
Yoshua Bengio
Pradeep Ravikumar
A. Suggala
AAML
UQCV
48
15
0
07 Jun 2022
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective
Rhea Chowers
Yair Weiss
33
2
0
06 Jun 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
27
58
0
02 Jun 2022
A Blessing of Dimensionality in Membership Inference through Regularization
Jasper Tan
Daniel LeJeune
Blake Mason
Hamid Javadi
Richard G. Baraniuk
32
18
0
27 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
32
34
0
12 May 2022
Investigating Generalization by Controlling Normalized Margin
Alexander R. Farhang
Jeremy Bernstein
Kushal Tirumala
Yang Liu
Yisong Yue
31
6
0
08 May 2022
Why adversarial training can hurt robust accuracy
Jacob Clarysse
Julia Hörrmann
Fanny Yang
AAML
13
18
0
03 Mar 2022
Robust Training under Label Noise by Over-parameterization
Sheng Liu
Zhihui Zhu
Qing Qu
Chong You
NoLa
OOD
32
106
0
28 Feb 2022
The Spectral Bias of Polynomial Neural Networks
Moulik Choraria
L. Dadi
Grigorios G. Chrysos
Julien Mairal
V. Cevher
24
18
0
27 Feb 2022
Stability vs Implicit Bias of Gradient Methods on Separable Data and Beyond
Matan Schliserman
Tomer Koren
24
23
0
27 Feb 2022
Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization
I Zaghloul Amir
Roi Livni
Nathan Srebro
32
6
0
27 Feb 2022
A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments
Randall Balestriero
Ishan Misra
Yann LeCun
35
20
0
16 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
46
2
0
15 Feb 2022
Previous
1
2
3
4
5
Next