Characterizing Implicit Bias in Terms of Optimization Geometry
22 February 2018
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
    AI4CE
arXiv: 1802.08246

Papers citing "Characterizing Implicit Bias in Terms of Optimization Geometry"

50 / 290 papers shown
Gradient Descent on Infinitely Wide Neural Networks: Global Convergence and Generalization
Francis R. Bach
Lénaïc Chizat
MLT
67
24
0
15 Oct 2021
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li
Tianhao Wang
Sanjeev Arora
MLT
121
105
0
13 Oct 2021
The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
Yifei Wang
Mert Pilanci
MLT MDE
86
11
0
13 Oct 2021
Implicit Bias of Linear Equivariant Networks
Hannah Lawrence
Kristian Georgiev
A. Dienes
B. Kiani
AI4CE
125
15
0
12 Oct 2021
On the Self-Penalization Phenomenon in Feature Selection
Michael I. Jordan
Keli Liu
Feng Ruan
54
4
0
12 Oct 2021
Does Momentum Change the Implicit Regularization on Separable Data?
Bohan Wang
Qi Meng
Huishuai Zhang
Ruoyu Sun
Wei Chen
Zhirui Ma
Tie-Yan Liu
99
18
0
08 Oct 2021
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
162
30
0
06 Oct 2021
Spectral Bias in Practice: The Role of Function Frequency in Generalization
Sara Fridovich-Keil
Raphael Gontijo-Lopes
Rebecca Roelofs
111
30
0
06 Oct 2021
Federated Asymptotics: a model to compare federated learning algorithms
Gary Cheng
Karan N. Chadha
John C. Duchi
FedML
90
17
0
16 Aug 2021
Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data
Yan Li
Caleb Ju
Ethan X. Fang
T. Zhao
69
9
0
15 Aug 2021
Implicit Sparse Regularization: The Impact of Depth and Early Stopping
Jiangyuan Li
Thanh V. Nguyen
Chinmay Hegde
R. K. Wong
93
30
0
12 Aug 2021
The Benefits of Implicit Regularization from SGD in Least Squares Problems
Difan Zou
Jingfeng Wu
Vladimir Braverman
Quanquan Gu
Dean Phillips Foster
Sham Kakade
72
31
0
10 Aug 2021
On the Hyperparameters in Stochastic Gradient Descent with Momentum
Bin Shi
108
14
0
09 Aug 2021
Structured Directional Pruning via Perturbation Orthogonal Projection
Yinchuan Li
Xiaofeng Liu
Yunfeng Shao
Qing Wang
Yanhui Geng
41
2
0
12 Jul 2021
SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
Satyen Kale
Ayush Sekhari
Karthik Sridharan
259
29
0
11 Jul 2021
A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf
Alon Brutzkus
Amir Globerson
91
17
0
04 Jul 2021
Fast Margin Maximization via Dual Acceleration
Ziwei Ji
Nathan Srebro
Matus Telgarsky
67
39
0
01 Jul 2021
Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
Arthur Jacot
François Ged
Berfin Şimşek
Clément Hongler
Franck Gabriel
86
55
0
30 Jun 2021
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
Scott Pesme
Loucas Pillaud-Vivien
Nicolas Flammarion
80
108
0
17 Jun 2021
What can linearized neural networks actually say about generalization?
Guillermo Ortiz-Jiménez
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
86
45
0
12 Jun 2021
Understanding Deflation Process in Over-parametrized Tensor Decomposition
Rong Ge
Y. Ren
Xiang Wang
Mo Zhou
80
19
0
11 Jun 2021
Label Noise SGD Provably Prefers Flat Global Minimizers
Alexandru Damian
Tengyu Ma
Jason D. Lee
NoLa
142
120
0
11 Jun 2021
From inexact optimization to learning via gradient concentration
Bernhard Stankewitz
Nicole Mücke
Lorenzo Rosasco
88
5
0
09 Jun 2021
Redundant representations help generalization in wide neural networks
Diego Doimo
Aldo Glielmo
Sebastian Goldt
Alessandro Laio
AI4CE
79
9
0
07 Jun 2021
Implicit Regularization in Matrix Sensing via Mirror Descent
Fan Wu
Patrick Rebeschini
42
11
0
28 May 2021
An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network
Yaoyu Zhang
Zheng Ma
Zhiwei Wang
Z. Xu
41
4
0
25 May 2021
A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu
Tianyu Ding
Jinxin Zhou
Xiao Li
Chong You
Jeremias Sulam
Qing Qu
88
204
0
06 May 2021
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures
Yuan Cao
Quanquan Gu
M. Belkin
81
53
0
28 Apr 2021
Minimum complexity interpolation in random features models
Michael Celentano
Theodor Misiakiewicz
Andrea Montanari
44
4
0
30 Mar 2021
Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm
Meena Jagadeesan
Ilya P. Razenshteyn
Suriya Gunasekar
113
21
0
24 Feb 2021
Implicit Regularization in Tensor Factorization
Noam Razin
Asaf Maman
Nadav Cohen
75
49
0
19 Feb 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Shahar Azulay
E. Moroshko
Mor Shpigel Nacson
Blake E. Woodworth
Nathan Srebro
Amir Globerson
Daniel Soudry
AI4CE
89
74
0
19 Feb 2021
Bridging the Gap Between Adversarial Robustness and Optimization Bias
Fartash Faghri
Sven Gowal
C. N. Vasconcelos
David J. Fleet
Fabian Pedregosa
Nicolas Le Roux
AAML
234
7
0
17 Feb 2021
Dissecting Supervised Contrastive Learning
Florian Graf
Christoph Hofer
Marc Niethammer
Roland Kwitt
SSL
164
77
0
17 Feb 2021
SGD Generalizes Better Than GD (And Regularization Doesn't Help)
I Zaghloul Amir
Tomer Koren
Roi Livni
70
46
0
01 Feb 2021
Painless step size adaptation for SGD
I. Kulikovskikh
Tarzan Legović
72
0
0
01 Feb 2021
Sparse Signal Models for Data Augmentation in Deep Learning ATR
Tushar Agarwal
Nithin Sugavanam
Emre Ertin
24
12
0
16 Dec 2020
NeurIPS 2020 Competition: Predicting Generalization in Deep Learning
Yiding Jiang
Pierre Foret
Scott Yak
Daniel M. Roy
H. Mobahi
Gintare Karolina Dziugaite
Samy Bengio
Suriya Gunasekar
Isabelle M Guyon
Behnam Neyshabur
OOD
74
55
0
14 Dec 2020
The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks
Bohan Wang
Qi Meng
Wei Chen
Tie-Yan Liu
84
36
0
11 Dec 2020
Implicit Regularization in ReLU Networks with the Square Loss
Gal Vardi
Ohad Shamir
89
51
0
09 Dec 2020
On Generalization of Adaptive Methods for Over-parameterized Linear Regression
Vatsal Shah
Soumya Basu
Anastasios Kyrillidis
Sujay Sanghavi
AI4CE
59
4
0
28 Nov 2020
Deep orthogonal linear networks are shallow
Pierre Ablin
ODL
19
3
0
27 Nov 2020
Implicit bias of deep linear networks in the large learning rate phase
Wei Huang
Weitao Du
R. Xu
Chunrui Liu
77
2
0
25 Nov 2020
Implicit bias of any algorithm: bounding bias via margin
Elvis Dohmatob
27
0
0
12 Nov 2020
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Jingfeng Wu
Difan Zou
Vladimir Braverman
Quanquan Gu
102
18
0
04 Nov 2020
Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets
Depen Morwani
H. G. Ramaswamy
50
3
0
24 Oct 2020
How Data Augmentation affects Optimization for Linear Regression
Boris Hanin
Yi Sun
83
16
0
21 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Preetum Nakkiran
Behnam Neyshabur
Hanie Sedghi
OffRL
97
11
0
16 Oct 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun
Shankar Krishnan
H. Mobahi
MLT
125
82
0
06 Oct 2020
Implicit Gradient Regularization
David Barrett
Benoit Dherin
101
152
0
23 Sep 2020