2306.13853
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
24 June 2023
Haoyuan Sun
Khashayar Gatmiry
Kwangjun Ahn
Navid Azizan
AI4CE
Papers citing "A Unified Approach to Controlling Implicit Regularization via Mirror Descent"
38 / 38 papers shown
Title
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Kuan-Fu Ding
Jingyang Li
Kim-Chuan Toh
97
8
0
26 Jun 2023
Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
Haoyuan Sun
Kwangjun Ahn
Christos Thrampoulidis
Navid Azizan
OOD
49
22
0
25 May 2022
Fast Rates for Noisy Interpolation Require Rethinking the Effects of Inductive Bias
Konstantin Donhauser
Nicolò Ruggeri
Stefan Stojanovic
Fanny Yang
52
22
0
07 Mar 2022
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
121
30
0
06 Oct 2021
Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data
Yan Li
Caleb Ju
Ethan X. Fang
T. Zhao
41
9
0
15 Aug 2021
Fast Margin Maximization via Dual Acceleration
Ziwei Ji
Nathan Srebro
Matus Telgarsky
53
39
0
01 Jul 2021
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
M. Belkin
53
186
0
29 May 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
420
5,000
0
24 Feb 2021
The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks
Bohan Wang
Qi Meng
Wei Chen
Tie-Yan Liu
69
36
0
11 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
676
41,483
0
22 Oct 2020
Gradient descent follows the regularization path for general losses
Ziwei Ji
Miroslav Dudík
Robert Schapire
Matus Telgarsky
AI4CE
FaML
150
62
0
19 Jun 2020
Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks
Like Hui
M. Belkin
UQCV
AAML
VLM
62
172
0
12 Jun 2020
Classification vs regression in overparameterized regimes: Does the loss function matter?
Vidya Muthukumar
Adhyyan Narang
Vignesh Subramanian
M. Belkin
Daniel J. Hsu
A. Sahai
97
151
0
16 May 2020
Designing Network Design Spaces
Ilija Radosavovic
Raj Prateek Kosaraju
Ross B. Girshick
Kaiming He
Piotr Dollár
GNN
102
1,693
0
30 Mar 2020
The Implicit and Explicit Regularization Effects of Dropout
Colin Wei
Sham Kakade
Tengyu Ma
96
118
0
28 Feb 2020
Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran
Gal Kaplun
Yamini Bansal
Tristan Yang
Boaz Barak
Ilya Sutskever
123
945
0
04 Dec 2019
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Ziwei Ji
Matus Telgarsky
72
178
0
26 Sep 2019
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
98
336
0
13 Jun 2019
Stochastic Mirror Descent on Overparameterized Nonlinear Models: Convergence, Implicit Regularization, and Generalization
Navid Azizan
Sahin Lale
B. Hassibi
146
74
0
10 Jun 2019
A Stochastic Interpretation of Stochastic Mirror Descent: Risk-Sensitive Optimality
Navid Azizan
B. Hassibi
17
5
0
03 Apr 2019
Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin
Daniel J. Hsu
Siyuan Ma
Soumik Mandal
244
1,659
0
28 Dec 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
198
448
0
21 Nov 2018
A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
AI4CE
ODL
266
1,469
0
09 Nov 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczos
Aarti Singh
MLT
ODL
233
1,276
0
04 Oct 2018
Just Interpolate: Kernel "Ridgeless" Regression Can Generalize
Tengyuan Liang
Alexander Rakhlin
81
355
0
01 Aug 2018
On the Implicit Bias of Dropout
Poorya Mianjy
R. Arora
René Vidal
68
67
0
26 Jun 2018
Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
Navid Azizan
B. Hassibi
48
64
0
04 Jun 2018
Convergence of Gradient Descent on Separable Data
Mor Shpigel Nacson
Jason D. Lee
Suriya Gunasekar
Pedro H. P. Savarese
Nathan Srebro
Daniel Soudry
76
169
0
05 Mar 2018
Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
AI4CE
73
413
0
22 Feb 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
207
19,335
0
13 Jan 2018
The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
163
924
0
27 Oct 2017
Spectrally-normalized margin bounds for neural networks
Peter L. Bartlett
Dylan J. Foster
Matus Telgarsky
ODL
212
1,225
0
26 Jun 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
86
1,032
0
23 May 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,510
0
10 Dec 2015
In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
AI4CE
99
662
0
20 Dec 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.7K
100,529
0
04 Sep 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,615
0
01 Sep 2014
Margins, Shrinkage, and Boosting
Matus Telgarsky
82
73
0
18 Mar 2013