A Convergence Theory for Deep Learning via Over-Parameterization [AI4CE, ODL]
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
arXiv 1811.03962 · 9 November 2018
Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization" (showing 50 of 354)

Dash: Semi-Supervised Learning with Dynamic Thresholding
Yi Tian Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin
01 Sep 2021 · 47 · 218 · 0

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization [MLT, AI4CE]
Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu
25 Aug 2021 · 47 · 39 · 0

Fast Sketching of Polynomial Kernels of Polynomial Degree
Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang
21 Aug 2021 · 21 · 40 · 0

Learning Transferable Parameters for Unsupervised Domain Adaptation [OOD]
Zhongyi Han, Haoliang Sun, Yilong Yin
13 Aug 2021 · 35 · 45 · 0

A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions
Arnulf Jentzen, Adrian Riekert
10 Aug 2021 · 33 · 13 · 0

Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach [OOD]
Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu
05 Aug 2021 · 40 · 36 · 0

Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang, Jason D. Lee, Zhaoran Wang, Zhuoran Yang
30 Jul 2021 · 33 · 47 · 0

Rethinking Hard-Parameter Sharing in Multi-Domain Learning [OOD]
Lijun Zhang, Qizheng Yang, Xiao Liu, Hui Guan
23 Jul 2021 · 31 · 14 · 0

AutoFormer: Searching Transformers for Visual Recognition [ViT]
Minghao Chen, Houwen Peng, Jianlong Fu, Haibin Ling
01 Jul 2021 · 38 · 259 · 0

The Values Encoded in Machine Learning Research
Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao
29 Jun 2021 · 41 · 274 · 0

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Haoxiang Wang, Han Zhao, Bo-wen Li
16 Jun 2021 · 37 · 88 · 0

A Neural Tangent Kernel Perspective of GANs
Jean-Yves Franceschi, Emmanuel de Bézenac, Ibrahim Ayed, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari
10 Jun 2021 · 37 · 26 · 0

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Li, Mihai Nica, Daniel M. Roy
07 Jun 2021 · 32 · 33 · 0

Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning [SSL, MLT]
Zixin Wen, Yuanzhi Li
31 May 2021 · 32 · 131 · 0

Fixed-Dimensional and Permutation Invariant State Representation of Autonomous Driving
Jingliang Duan, Dongjie Yu, Shengbo Eben Li, Wenxuan Wang, Yangang Ren, Ziyu Lin, B. Cheng
24 May 2021 · 30 · 10 · 0

Improved OOD Generalization via Adversarial Training and Pre-training [VLM]
Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
24 May 2021 · 31 · 83 · 0

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions [ODL]
Ameya Dilip Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Karniadakis
20 May 2021 · 45 · 131 · 0

Global Convergence of Three-layer Neural Networks in the Mean Field Regime [MLT, AI4CE]
H. Pham, Phan-Minh Nguyen
11 May 2021 · 41 · 19 · 0

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis [FedML]
Baihe Huang, Xiaoxiao Li, Zhao Song, Xin Yang
11 May 2021 · 31 · 16 · 0

A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu
06 May 2021 · 40 · 196 · 0

RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton
01 May 2021 · 30 · 30 · 0

Generalization Guarantees for Neural Architecture Search with Train-Validation Split [AI4CE, OOD]
Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi
29 Apr 2021 · 36 · 13 · 0

Understanding Overparameterization in Generative Adversarial Networks [AI4CE]
Yogesh Balaji, M. Sajedi, Neha Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, S. Feizi
12 Apr 2021 · 22 · 21 · 0

A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity
Seo Taek Kong, Soomin Jeon, Dongbin Na, Jaewon Lee, Honglak Lee, Kyu-Hwan Jung
08 Apr 2021 · 23 · 6 · 0

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions [MLT]
Arnulf Jentzen, Adrian Riekert
01 Apr 2021 · 34 · 13 · 0

The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation
Q. Du, Yiqi Gu, Haizhao Yang, Chao Zhou
21 Mar 2021 · 26 · 20 · 0

Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
19 Mar 2021 · 24 · 10 · 0

GPT Understands, Too [VLM]
Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang
18 Mar 2021 · 87 · 1,146 · 0

Experiments with Rich Regime Training for Deep Learning
Xinyan Li, A. Banerjee
26 Feb 2021 · 32 · 2 · 0

Entanglement Diagnostics for Efficient Quantum Computation
Joonho Kim, Yaron Oz
24 Feb 2021 · 29 · 10 · 0

On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
Zhiyuan Li, Sadhika Malladi, Sanjeev Arora
24 Feb 2021 · 44 · 78 · 0

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao
24 Feb 2021 · 21 · 13 · 0

Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases [ODL]
Arnulf Jentzen, T. Kröger
23 Feb 2021 · 28 · 7 · 0

Deep ReLU Networks Preserve Expected Length
Boris Hanin, Ryan Jeong, David Rolnick
21 Feb 2021 · 29 · 14 · 0

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek
19 Feb 2021 · 28 · 24 · 0

FedBN: Federated Learning on Non-IID Features via Local Batch Normalization [OOD, FedML]
Xiaoxiao Li, Meirui Jiang, Xiaofei Zhang, Michael Kamp, Qi Dou
15 Feb 2021 · 168 · 790 · 0

A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network
Mo Zhou, Rong Ge, Chi Jin
04 Feb 2021 · 76 · 45 · 0

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training [FAtt]
Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
29 Jan 2021 · 130 · 167 · 0

On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
24 Jan 2021 · 45 · 48 · 0

Reproducing Activation Function for Deep Learning
Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang
13 Jan 2021 · 36 · 21 · 0

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin
12 Jan 2021 · 41 · 3 · 0

Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise [FedML, MLT]
Spencer Frei, Yuan Cao, Quanquan Gu
04 Jan 2021 · 64 · 19 · 0

Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training [AAML]
Theodoros Tsiligkaridis, Jay Roberts
22 Dec 2020 · 22 · 11 · 0

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen, Marco Mondelli, Guido Montúfar
21 Dec 2020 · 25 · 81 · 0

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning [FedML]
Zeyuan Allen-Zhu, Yuanzhi Li
17 Dec 2020 · 60 · 356 · 0

Estimation of the Mean Function of Functional Data via Deep Neural Networks
Shuoyang Wang, Guanqun Cao, Zuofeng Shang
08 Dec 2020 · 35 · 20 · 0

Gradient Starvation: A Learning Proclivity in Neural Networks [MLT]
Mohammad Pezeshki, Sekouba Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie
18 Nov 2020 · 50 · 258 · 0

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, Masashi Sugiyama
12 Nov 2020 · 21 · 60 · 0

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan
09 Nov 2020 · 39 · 18 · 0

Are wider nets better given the same number of parameters?
A. Golubeva, Behnam Neyshabur, Guy Gur-Ari
27 Oct 2020 · 27 · 44 · 0