A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
arXiv:1811.03962, 9 November 2018
Tags: AI4CE, ODL
Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization" (showing 50 of 354)
Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks. Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo. 18 Oct 2022.
Review Learning: Alleviating Catastrophic Forgetting with Generative Replay without Generator [KELM, OffRL]. Jaesung Yoo, Sung-Hyuk Choi, Yewon Yang, Suhyeon Kim, J. Choi, ..., H. J. Joo, Dae-Jung Kim, R. Park, Hyeong-Jin Yoon, Kwangsoo Kim. 17 Oct 2022.
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent. Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan. 13 Oct 2022.
Towards Theoretically Inspired Neural Initialization Optimization. Yibo Yang, Hong Wang, Haobo Yuan, Zhouchen Lin. 12 Oct 2022.
On skip connections and normalisation layers in deep optimisation [ODL]. L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey. 10 Oct 2022.
Stability Analysis and Generalization Bounds of Adversarial Training [AAML]. Jiancong Xiao, Yanbo Fan, Ruoyu Sun, Jue Wang, Zhimin Luo. 03 Oct 2022.
Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis [AAML]. Jiancong Xiao, Zeyu Qin, Yanbo Fan, Baoyuan Wu, Jue Wang, Zhimin Luo. 02 Oct 2022.
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition. Jianhao Ma, Li-Zhen Guo, S. Fattahi. 01 Oct 2022.
Rethinking skip connection model as a learnable Markov chain [BDL]. Dengsheng Chen, Jie Hu, Wenwen Qiang, Xiaoming Wei, Enhua Wu. 30 Sep 2022.
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD [MLT]. Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu. 29 Sep 2022.
Magnitude and Angle Dynamics in Training Single ReLU Neurons [MLT]. Sangmin Lee, Byeongsu Sim, Jong Chul Ye. 27 Sep 2022.
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty. Thomas George, Guillaume Lajoie, A. Baratin. 19 Sep 2022.
Approximation results for Gradient Descent trained Shallow Neural Networks in 1/d [ODL]. R. Gentile, G. Welper. 17 Sep 2022.
Flashlight: Scalable Link Prediction with Effective Decoders [BDL]. Yiwei Wang, Bryan Hooi, Yozen Liu, Tong Zhao, Zhichun Guo, Neil Shah. 17 Sep 2022.
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization). Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher. 15 Sep 2022.
Generalization Properties of NAS under Activation and Skip Connection Search [AI4CE]. Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher. 15 Sep 2022.
On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent. Selina Drews, Michael Kohler. 30 Aug 2022.
Towards Understanding Mixture of Experts in Deep Learning [MLT, MoE]. Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li. 04 Aug 2022.
BiFeat: Supercharge GNN Training via Graph Feature Quantization [GNN]. Yuxin Ma, Ping Gong, Jun Yi, Z. Yao, Cheng-rong Li, Yuxiong He, Feng Yan. 29 Jul 2022.
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit. Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang. 18 Jul 2022.
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent. Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora. 08 Jul 2022.
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity. Jianyi Yang, Shaolei Ren. 02 Jul 2022.
q-Learning in Continuous Time [OffRL]. Yanwei Jia, X. Zhou. 02 Jul 2022.
Neural Networks can Learn Representations with Gradient Descent [SSL, MLT]. Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi. 30 Jun 2022.
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis. Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff. 26 Jun 2022.
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling [UQCV, BDL]. Jiri Hron, Roman Novak, Jeffrey Pennington, Jascha Narain Sohl-Dickstein. 15 Jun 2022.
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction [FAtt]. Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora. 14 Jun 2022.
Scaling ResNets in the Large-depth Regime. Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert. 14 Jun 2022.
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms. Lam M. Nguyen, Trang H. Tran. 13 Jun 2022.
Wavelet Regularization Benefits Adversarial Training [AAML]. Jun Yan, Huilin Yin, Xiaoyang Deng, Zi-qin Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll. 08 Jun 2022.
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials. Eshaan Nichani, Yunzhi Bai, Jason D. Lee. 08 Jun 2022.
Non-convex online learning via algorithmic equivalence. Udaya Ghai, Zhou Lu, Elad Hazan. 30 May 2022.
Global Convergence of Over-parameterized Deep Equilibrium Models. Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin. 27 May 2022.
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width. Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu. 24 May 2022.
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture [GNN, AI4CE]. Libin Zhu, Chaoyue Liu, M. Belkin. 24 May 2022.
Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable [MLT]. Promit Ghosal, Srinath Mahankali, Yihang Sun. 24 May 2022.
Gaussian Pre-Activations in Neural Networks: Myth or Reality? [AI4CE]. Pierre Wolinski, Julyan Arbel. 24 May 2022.
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks [MLT]. Blake Bordelon, Cengiz Pehlevan. 19 May 2022.
Robust Deep Neural Network Estimation for Multi-dimensional Functional Data [3DPC, OOD]. Shuoyang Wang, Guanqun Cao. 19 May 2022.
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [MLT]. Itay Safran, Gal Vardi, Jason D. Lee. 18 May 2022.
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning [SSL]. Zixin Wen, Yuanzhi Li. 12 May 2022.
Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models. Zhongyu Li, Jun Zeng, A. Thirugnanam, Koushil Sreenath. 11 May 2022.
Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis. Wuyang Chen, Wei Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang. 11 May 2022.
Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD. Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias. 26 Apr 2022.
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes. Chao Ma, D. Kunin, Lei Wu, Lexing Ying. 24 Apr 2022.
On Feature Learning in Neural Networks with Global Convergence Guarantees [MLT]. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. 22 Apr 2022.
On Convergence Lemma and Convergence Stability for Piecewise Analytic Functions. Xiaotie Deng, Hanyu Li, Ningyuan Li. 04 Apr 2022.
Training Fully Connected Neural Networks is ∃ℝ-Complete [OffRL]. Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, Simon Weber. 04 Apr 2022.
Convergence of gradient descent for deep neural networks [ODL]. S. Chatterjee. 30 Mar 2022.
Random matrix analysis of deep neural network weight matrices. M. Thamm, Max Staats, B. Rosenow. 28 Mar 2022.