arXiv:1906.05890
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
13 June 2019
Kaifeng Lyu
Jian Li
Papers citing
"Gradient Descent Maximizes the Margin of Homogeneous Neural Networks"
50 / 245 papers shown
Title
Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar
Jarvis Haupt
ODL
30
7
0
14 Feb 2024
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers
G. Buzaglo
I. Harel
Mor Shpigel Nacson
Alon Brutzkus
Nathan Srebro
Daniel Soudry
62
3
0
09 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
34
13
0
08 Feb 2024
Understanding the Generalization Benefits of Late Learning Rate Decay
Yinuo Ren
Chao Ma
Lexing Ying
AI4CE
32
6
0
21 Jan 2024
Generator Born from Classifier
Runpeng Yu
Xinchao Wang
35
4
0
05 Dec 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
41
32
0
30 Nov 2023
Applying statistical learning theory to deep learning
Cédric Gerbelot
Avetik G. Karagulyan
Stefani Karp
Kavya Ravichandran
Menachem Stern
Nathan Srebro
FedML
16
2
0
26 Nov 2023
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang
Zeping Min
Lei Wu
30
3
0
24 Nov 2023
Feature emergence via margin maximization: case studies in algebraic tasks
Depen Morwani
Benjamin L. Edelman
Costin-Andrei Oncescu
Rosie Zhao
Sham Kakade
39
14
0
13 Nov 2023
On the Robustness of Neural Collapse and the Neural Collapse of Robustness
Jingtong Su
Ya Shi Zhang
Nikolaos Tsilivis
Julia Kempe
AAML
34
4
0
13 Nov 2023
Vanishing Gradients in Reinforcement Finetuning of Language Models
Noam Razin
Hattie Zhou
Omid Saremi
Vimal Thilak
Arwen Bradley
Preetum Nakkiran
Josh Susskind
Etai Littwin
18
7
0
31 Oct 2023
A Quadratic Synchronization Rule for Distributed Deep Learning
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
54
1
0
22 Oct 2023
Fundamental Limits of Membership Inference Attacks on Machine Learning Models
Eric Aubinais
Elisabeth Gassiat
Pablo Piantanida
MIACV
50
2
0
20 Oct 2023
Deep Neural Networks Tend To Extrapolate Predictably
Katie Kang
Amrith Rajagopal Setlur
Claire Tomlin
Sergey Levine
23
0
0
02 Oct 2023
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
MLT
79
13
0
26 Sep 2023
Graph Neural Networks Use Graphs When They Shouldn't
Maya Bechler-Speicher
Ido Amos
Ran Gilad-Bachrach
Amir Globerson
GNN
AI4CE
13
15
0
08 Sep 2023
Explaining grokking through circuit efficiency
Vikrant Varma
Rohin Shah
Zachary Kenton
János Kramár
Ramana Kumar
18
48
0
05 Sep 2023
Implicit regularization of deep residual networks towards neural ODEs
P. Marion
Yu-Han Wu
Michael E. Sander
Gérard Biau
34
14
0
03 Sep 2023
On the Implicit Bias of Adam
M. D. Cattaneo
Jason M. Klusowski
Boris Shigida
31
17
0
31 Aug 2023
Transformers as Support Vector Machines
Davoud Ataee Tarzanagh
Yingcong Li
Christos Thrampoulidis
Samet Oymak
48
43
0
31 Aug 2023
Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz
Andrea Montanari
39
12
0
25 Aug 2023
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods
A. Ma
Yangchen Pan
Amir-massoud Farahmand
AAML
25
5
0
13 Aug 2023
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min
Enrique Mallada
René Vidal
MLT
34
18
0
24 Jul 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
34
10
0
04 Jul 2023
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
Mor Shpigel Nacson
Rotem Mulayoff
Greg Ongie
T. Michaeli
Daniel Soudry
20
12
0
30 Jun 2023
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
Haoyuan Sun
Khashayar Gatmiry
Kwangjun Ahn
Navid Azizan
AI4CE
21
10
0
24 Jun 2023
Max-Margin Token Selection in Attention Mechanism
Davoud Ataee Tarzanagh
Yingcong Li
Xuechen Zhang
Samet Oymak
37
38
0
23 Jun 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuan-Fang Li
Quanquan Gu
MLT
37
5
0
20 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov
Matthias Englert
R. Lazic
MLT
36
12
0
10 Jun 2023
Revealing Model Biases: Assessing Deep Neural Networks via Recovered Sample Analysis
M. Mehmanchi
Mahbod Nouri
Mohammad Sabokrou
AAML
30
1
0
10 Jun 2023
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models
Ritwik Sinha
Zhao-quan Song
Dinesh Manocha
22
23
0
04 Jun 2023
Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks
Roey Magen
Ohad Shamir
19
0
0
25 May 2023
Dendritic Integration Based Quadratic Neural Networks Outperform Traditional Aritificial Ones
Chongmin Liu
Songting Li
Douglas Zhou
18
0
0
25 May 2023
From Tempered to Benign Overfitting in ReLU Neural Networks
Guy Kornowski
Gilad Yehudai
Ohad Shamir
20
12
0
24 May 2023
ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models
Suzanna Parkinson
Greg Ongie
Rebecca Willett
65
6
0
24 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri
Christos Thrampoulidis
MLT
16
3
0
22 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
29
17
0
19 May 2023
Deep ReLU Networks Have Surprisingly Simple Polytopes
Fenglei Fan
Wei Huang
Xiang-yu Zhong
Lecheng Ruan
T. Zeng
Huan Xiong
Fei-Yue Wang
64
5
0
16 May 2023
Reconstructing Training Data from Multiclass Neural Networks
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Michal Irani
33
4
0
05 May 2023
A Study of Neural Collapse Phenomenon: Grassmannian Frame, Symmetry and Generalization
Peifeng Gao
Qianqian Xu
Peisong Wen
Huiyang Shao
Zhiyong Yang
Qingming Huang
22
6
0
18 Apr 2023
Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme
Nicolas Flammarion
31
35
0
02 Apr 2023
Solving Regularized Exp, Cosh and Sinh Regression Problems
Zhihang Li
Zhao-quan Song
Dinesh Manocha
31
39
0
28 Mar 2023
On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data
Tina Behnia
Ganesh Ramachandra Kini
V. Vakilian
Christos Thrampoulidis
44
17
0
14 Mar 2023
On the Implicit Bias of Linear Equivariant Steerable Networks
Ziyu Chen
Wei-wei Zhu
29
3
0
07 Mar 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
30
22
0
02 Mar 2023
The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
37
17
0
02 Mar 2023
Penalising the biases in norm regularisation enforces sparsity
Etienne Boursier
Nicolas Flammarion
37
14
0
02 Mar 2023
Transformed Low-Rank Parameterization Can Help Robust Generalization for Tensor Neural Networks
Andong Wang
Chong Li
Mingyuan Bai
Zhong Jin
Guoxu Zhou
Qianchuan Zhao
OOD
AAML
13
5
0
01 Mar 2023
On the Training Instability of Shuffling SGD with Batch Normalization
David Wu
Chulhee Yun
S. Sra
29
4
0
24 Feb 2023
Generalization and Stability of Interpolating Neural Networks with Minimal Width
Hossein Taheri
Christos Thrampoulidis
32
16
0
18 Feb 2023