arXiv:1901.08584
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
24 January 2019
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · MLT
Papers citing "Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks"
50 / 239 papers shown
Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features
Qingrui Jia, Xuhong Li, Lei Yu, Jiang Bian, Penghao Zhao, Shupeng Li, Haoyi Xiong, Dejing Dou · NoLa · 19 Dec 2022

Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs
Chenxiao Yang, Qitian Wu, Jiahua Wang, Junchi Yan · AI4CE · 18 Dec 2022

Learning threshold neurons via the "edge of stability"
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang · MLT · 14 Dec 2022

Leveraging Unlabeled Data to Track Memorization
Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran · NoLa, TDI · 08 Dec 2022

Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo · 25 Nov 2022

Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang, Yongyi Mao · 19 Nov 2022

Mechanistic Mode Connectivity
Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka · 15 Nov 2022

Characterizing the Spectrum of the NTK via a Power Series Expansion
Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar · 15 Nov 2022

Do highly over-parameterized neural networks generalize since bad solutions are rare?
Julius Martinetz, T. Martinetz · 07 Nov 2022

A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna · MLT · 28 Oct 2022

The Curious Case of Benign Memorization
Sotiris Anagnostidis, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann · AAML · 25 Oct 2022

Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee · 20 Oct 2022

Data-Efficient Augmentation for Training Neural Networks
Tian Yu Liu, Baharan Mirzasoleiman · 15 Oct 2022

What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis, Julia Kempe · AAML · 11 Oct 2022

Efficient NTK using Dimensionality Reduction
Nir Ailon, Supratim Shit · 10 Oct 2022

The Dynamic of Consensus in Deep Networks and the Identification of Noisy Labels
Daniel Shwartz, Uri Stern, D. Weinshall · NoLa · 02 Oct 2022

Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma, Li-Zhen Guo, S. Fattahi · 01 Oct 2022

On the optimization and generalization of overparameterized implicit neural networks
Tianxiang Gao, Hongyang Gao · MLT, AI4CE · 30 Sep 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu · MLT · 29 Sep 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee, Byeongsu Sim, Jong Chul Ye · MLT · 27 Sep 2022

Approximation results for Gradient Descent trained Shallow Neural Networks in 1d
R. Gentile, G. Welper · ODL · 17 Sep 2022

Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · AI4CE · 15 Sep 2022

On Generalization of Decentralized Learning with Separable Data
Hossein Taheri, Christos Thrampoulidis · FedML · 15 Sep 2022

On the generalization of learning algorithms that do not converge
N. Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka · MLT · 16 Aug 2022

Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li · MLT, MoE · 04 Aug 2022

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li · 26 Jul 2022

Can we achieve robustness from data alone?
Nikolaos Tsilivis, Jingtong Su, Julia Kempe · OOD, DD · 24 Jul 2022

Single Model Uncertainty Estimation via Stochastic Data Centering
Jayaraman J. Thiagarajan, Rushil Anirudh, V. Narayanaswamy, P. Bremer · UQCV, OOD · 14 Jul 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora · 08 Jul 2022

Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang, Shaolei Ren · 02 Jul 2022

Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 30 Jun 2022

Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks
G. Farhani, Alexander Kazachek, Boyu Wang · 29 Jun 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff · 26 Jun 2022

Fast Finite Width Neural Tangent Kernel
Roman Novak, Jascha Narain Sohl-Dickstein, S. Schoenholz · AAML · 17 Jun 2022

On the fast convergence of minibatch heavy ball momentum
Raghu Bollapragada, Tyler Chen, Rachel A. Ward · 15 Jun 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora · FAtt · 14 Jun 2022

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu-Xiang Wang · MQ · 13 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran · 13 Jun 2022

Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli · 08 Jun 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee · 08 Jun 2022

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion · ODL · 02 Jun 2022

Gaussian Pre-Activations in Neural Networks: Myth or Reality?
Pierre Wolinski, Julyan Arbel · AI4CE · 24 May 2022

Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey · 18 May 2022

The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen, Yuanzhi Li · SSL · 12 May 2022

On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna · MLT · 22 Apr 2022

Theory of Graph Neural Networks: Representation and Learning
Stefanie Jegelka · GNN, AI4CE · 16 Apr 2022

Graph Neural Networks for Wireless Communications: From Theory to Practice
Yifei Shen, Jun Zhang, Shenghui Song, Khaled B. Letaief · GNN, AI4CE · 21 Mar 2022

Generalization Through The Lens Of Leave-One-Out Error
Gregor Bachmann, Thomas Hofmann, Aurelien Lucchi · 07 Mar 2022

The Spectral Bias of Polynomial Neural Networks
Moulik Choraria, L. Dadi, Grigorios G. Chrysos, Julien Mairal, V. Cevher · 27 Feb 2022

Benefit of Interpolation in Nearest Neighbor Algorithms
Yue Xing, Qifan Song, Guang Cheng · 23 Feb 2022