1802.08246
Cited By
v1
v2
v3 (latest)
Characterizing Implicit Bias in Terms of Optimization Geometry
22 February 2018
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
AI4CE
Papers citing "Characterizing Implicit Bias in Terms of Optimization Geometry"
50 / 290 papers shown
Title
Optimal Implicit Bias in Linear Regression
K. N. Varma
Babak Hassibi
30
0
0
20 Jun 2025
Constant Stepsize Local GD for Logistic Regression: Acceleration by Instability
M. Crawshaw
Blake Woodworth
Mingrui Liu
29
0
0
16 Jun 2025
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
Yixiao Huang
Hanlin Zhu
Tianyu Guo
Jiantao Jiao
Somayeh Sojoudi
Michael I. Jordan
Stuart Russell
Song Mei
LRM
126
0
0
12 Jun 2025
Replay Can Provably Increase Forgetting
Yasaman Mahdaviyeh
James Lucas
Mengye Ren
A. Tolias
R. Zemel
T. Pitassi
KELM
CLL
82
0
0
04 Jun 2025
HAM: A Hyperbolic Step to Regulate Implicit Bias
Tom Jacobs
Advait Gadhikar
Celia Rubio-Madrigal
R. Burkholz
79
0
0
03 Jun 2025
Unlocking the Power of Rehearsal in Continual Learning: A Theoretical Perspective
Junze Deng
Qinhang Wu
Peizhong Ju
Sen Lin
Yingbin Liang
Ness B. Shroff
CLL
22
0
0
30 May 2025
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Vatsal Sharan
Mahdi Soltanolkotabi
24
0
0
29 May 2025
Variational Deep Learning via Implicit Regularization
Jonathan Wenger
Beau Coker
Juraj Marusic
John P. Cunningham
OOD
UQCV
BDL
56
0
0
26 May 2025
Embedding principle of homogeneous neural network for classification problem
Jiahan Zhang
Yaoyu Zhang
86
0
0
18 May 2025
Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias
Yura Malitsky
Alexander Posch
54
0
0
05 May 2025
Sign-In to the Lottery: Reparameterizing Sparse Training From Scratch
Advait Gadhikar
Tom Jacobs
Chao Zhou
R. Burkholz
74
0
0
17 Apr 2025
Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?
Tom Jacobs
Chao Zhou
R. Burkholz
OffRL
AI4CE
74
1
0
17 Apr 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
159
0
0
11 Apr 2025
Make Optimization Once and for All with Fine-grained Guidance
Mingjia Shi
Ruihan Lin
Xuxi Chen
Yuhao Zhou
Zezhen Ding
...
Tong Wang
Kai Wang
Zhangyang Wang
Jing Zhang
Tianlong Chen
121
1
0
14 Mar 2025
Early-Stopped Mirror Descent for Linear Regression over Convex Bodies
Tobias Wegel
Gil Kur
Patrick Rebeschini
92
0
0
05 Mar 2025
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li
Sen-Fon Lin
Lingjie Duan
Yingbin Liang
Ness B. Shroff
MoE
MoMe
CLL
279
17
0
20 Feb 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
193
10
0
20 Feb 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman
Nicolas Schreuder
455
0
0
08 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
218
1
0
21 Dec 2024
Slowing Down Forgetting in Continual Learning
Pascal Janetzky
Tobias Schlagenhauf
Stefan Feuerriegel
CLL
121
0
0
11 Nov 2024
The Implicit Bias of Gradient Descent on Separable Multiclass Data
Hrithik Ravi
Clayton Scott
Daniel Soudry
Yutong Wang
120
4
0
02 Nov 2024
Simplicity Bias via Global Convergence of Sharpness Minimization
Khashayar Gatmiry
Zhiyuan Li
Sashank J. Reddi
Stefanie Jegelka
59
1
0
21 Oct 2024
A Mirror Descent Perspective of Smoothed Sign Descent
Shuyang Wang
Diego Klabjan
75
1
0
18 Oct 2024
MUSO: Achieving Exact Machine Unlearning in Over-Parameterized Regimes
Ruikai Yang
Mingzhen He
Zhengbao He
Youmei Qiu
Xiaolin Huang
MU
BDL
77
1
0
11 Oct 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM
AI4CE
88
7
0
27 Sep 2024
Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning?
Peizhong Ju
Haibo Yang
Jia Liu
Yingbin Liang
Ness B. Shroff
FedML
80
0
0
05 Sep 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Mohammadamin Banayeeanzade
Mahdi Soltanolkotabi
Mohammad Rostami
CLL
LRM
311
4
0
29 Aug 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
191
4
0
19 Aug 2024
Local vs Global continual learning
Giulia Lanzillotta
Sidak Pal Singh
Benjamin Grewe
Thomas Hofmann
CLL
80
0
0
23 Jul 2024
Why Do You Grok? A Theoretical Analysis of Grokking Modular Addition
Mohamad Amin Mohamadi
Zhiyuan Li
Lei Wu
Danica J. Sutherland
112
11
0
17 Jul 2024
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
143
2
0
08 Jul 2024
Landscaping Linear Mode Connectivity
Sidak Pal Singh
Linara Adilova
Michael Kamp
Asja Fischer
Bernhard Scholkopf
Thomas Hofmann
120
6
0
24 Jun 2024
Implicit Bias of Mirror Flow on Separable Data
Scott Pesme
Radu-Alexandru Dragomir
Nicolas Flammarion
84
1
0
18 Jun 2024
The Implicit Bias of Adam on Separable Data
Chenyang Zhang
Difan Zou
Yuan Cao
AI4CE
94
9
0
15 Jun 2024
On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions
Denys Pushkin
Raphael Berthier
Emmanuel Abbe
65
0
0
10 Jun 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
129
18
0
10 Jun 2024
The Price of Implicit Bias in Adversarially Robust Generalization
Nikolaos Tsilivis
Natalie Frank
Nathan Srebro
Julia Kempe
106
4
0
07 Jun 2024
A Universal Class of Sharpness-Aware Minimization Algorithms
B. Tahmasebi
Ashkan Soleymani
Dara Bahri
Stefanie Jegelka
Patrick Jaillet
AAML
81
3
0
06 Jun 2024
Towards a Sampling Theory for Implicit Neural Representations
Mahrokh Najaf
Gregory Ongie
79
0
0
28 May 2024
Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets
Arthur Jacot
Alexandre Kaiser
72
1
0
27 May 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
253
10
0
26 May 2024
Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization
Aditya Biswas
81
1
0
29 Apr 2024
Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks
Adeyemi Damilare Adeoye
Philipp Christian Petersen
Alberto Bemporad
65
1
0
23 Apr 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Victor C. M. Leung
Yanyi Guo
Xiping Hu
GNN
103
8
0
09 Apr 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
80
23
0
05 Apr 2024
High-dimensional analysis of ridge regression for non-identically distributed data with a variance profile
Jérémie Bigot
Issa-Mbenard Dabo
Camille Male
97
4
0
29 Mar 2024
On the Benefits of Over-parameterization for Out-of-Distribution Generalization
Yifan Hao
Yong Lin
Difan Zou
Tong Zhang
OODD
OOD
88
6
0
26 Mar 2024
The Effectiveness of Local Updates for Decentralized Learning under Data Heterogeneity
Tongle Wu
Ying Sun
51
1
0
23 Mar 2024
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
Heejune Sheen
Siyu Chen
Tianhao Wang
Harrison H. Zhou
MLT
87
13
0
13 Mar 2024
Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems
Junwei Su
Difan Zou
Chuan Wu
111
0
0
13 Mar 2024