Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1802.08246
Cited By
Characterizing Implicit Bias in Terms of Optimization Geometry
22 February 2018
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Characterizing Implicit Bias in Terms of Optimization Geometry"
50 / 110 papers shown
Title
Embedding principle of homogeneous neural network for classification problem
Jiahan Zhang
Yaoyu Zhang
Tao Luo
17
0
0
18 May 2025
Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias
Yura Malitsky
Alexander Posch
27
0
0
05 May 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
68
0
0
11 Apr 2025
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li
Sen-Fon Lin
Lingjie Duan
Yingbin Liang
Ness B. Shroff
MoE
MoMe
CLL
155
14
0
20 Feb 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
70
9
0
20 Feb 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman
Nicolas Schreuder
233
0
0
08 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
81
0
0
21 Dec 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Mohammadamin Banayeeanzade
Mahdi Soltanolkotabi
Mohammad Rostami
CLL
LRM
103
1
0
29 Aug 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
49
3
0
19 Aug 2024
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
94
2
0
08 Jul 2024
Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets
Arthur Jacot
Alexandre Kaiser
38
0
0
27 May 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
73
6
0
26 May 2024
Hidden Synergy:
L
1
L_1
L
1
Weight Normalization and 1-Path-Norm Regularization
Aditya Biswas
41
0
0
29 Apr 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Victor C. M. Leung
Yanyi Guo
Xiping Hu
GNN
39
6
0
09 Apr 2024
Implicit Bias of AdamW:
ℓ
∞
\ell_\infty
ℓ
∞
Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
50
13
0
05 Apr 2024
High-dimensional analysis of ridge regression for non-identically distributed data with a variance profile
Jérémie Bigot
Issa-Mbenard Dabo
Camille Male
31
4
0
29 Mar 2024
Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks
Hristo Papazov
Scott Pesme
Nicolas Flammarion
38
5
0
08 Mar 2024
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Yuxiao Wen
Arthur Jacot
65
6
0
12 Feb 2024
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
50
1
0
29 Nov 2023
Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models
David X. Wu
A. Sahai
29
2
0
23 Jun 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuan-Fang Li
Quanquan Gu
MLT
37
5
0
20 Jun 2023
Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage
Yu Gui
Cong Ma
Yiqiao Zhong
27
7
0
06 Jun 2023
Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff
Arthur Jacot
MLT
26
13
0
30 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
32
17
0
19 May 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
40
7
0
09 May 2023
General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai
Vidya Muthukumar
34
5
0
13 Mar 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
18
7
0
19 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Mo Zhou
Rong Ge
37
2
0
01 Feb 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
48
49
0
30 Jan 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
54
34
0
27 Jan 2023
Iterative regularization in classification via hinge loss diagonal descent
Vassilis Apidopoulos
T. Poggio
Lorenzo Rosasco
S. Villa
29
2
0
24 Dec 2022
Tight bounds for maximum
ℓ
1
\ell_1
ℓ
1
-margin classifiers
Stefan Stojanovic
Konstantin Donhauser
Fanny Yang
42
0
0
07 Dec 2022
Regression as Classification: Influence of Task Formulation on Neural Network Features
Lawrence Stewart
Francis R. Bach
Quentin Berthet
Jean-Philippe Vert
37
24
0
10 Nov 2022
Stochastic Mirror Descent in Average Ensemble Models
Taylan Kargin
Fariborz Salehi
B. Hassibi
26
1
0
27 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
31
4
0
13 Oct 2022
Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani
M. Field
16
8
0
12 Oct 2022
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Arthur Jacot
46
25
0
29 Sep 2022
Why neural networks find simple solutions: the many regularizers of geometric complexity
Benoit Dherin
Michael Munn
M. Rosca
David Barrett
57
31
0
27 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Niladri S. Chatterji
Philip M. Long
23
8
0
19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
36
73
0
26 Aug 2022
Imbalance Trouble: Revisiting Neural-Collapse Geometry
Christos Thrampoulidis
Ganesh Ramachandra Kini
V. Vakilian
Tina Behnia
35
70
0
10 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
45
28
0
08 Jul 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien
J. Reygner
Nicolas Flammarion
NoLa
33
31
0
20 Jun 2022
Reconstructing Training Data from Trained Neural Networks
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
40
132
0
15 Jun 2022
How catastrophic can catastrophic forgetting be in linear regression?
Itay Evron
E. Moroshko
Rachel A. Ward
Nati Srebro
Daniel Soudry
CLL
35
48
0
19 May 2022
Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization
I Zaghloul Amir
Roi Livni
Nathan Srebro
32
6
0
27 Feb 2022
A Note on Machine Learning Approach for Computational Imaging
Bin Dong
26
0
0
24 Feb 2022
Benign Overfitting in Adversarially Robust Linear Classification
Jinghui Chen
Yuan Cao
Quanquan Gu
AAML
SILM
34
10
0
31 Dec 2021
Rethinking Neural vs. Matrix-Factorization Collaborative Filtering: the Theoretical Perspectives
Zida Cheng
Chuanwei Ruan
Siheng Chen
Sushant Kumar
Ya Zhang
27
16
0
23 Oct 2021
The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
Yifei Wang
Mert Pilanci
MLT
MDE
55
11
0
13 Oct 2021
1
2
3
Next