An Improved Analysis of Training Over-parameterized Deep Neural Networks
Difan Zou, Quanquan Gu
arXiv:1906.04688, 11 June 2019

Papers citing "An Improved Analysis of Training Over-parameterized Deep Neural Networks" (50 / 56 papers shown):
- Training NTK to Generalize with KARE (Johannes Schwab, Bryan Kelly, Semyon Malamud, Teng Andrea Xu; 16 May 2025)
- High-entropy Advantage in Neural Networks' Generalizability (Entao Yang, Jiahui Geng, Yue Shang, Ge Zhang; 17 Mar 2025) [AI4CE]
- Feature Learning Beyond the Edge of Stability (Dávid Terjék; 18 Feb 2025) [MLT]
- Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits (H. Bui, Enrique Mallada, Anqi Liu; 08 Nov 2024)
- Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes (Nikita Kiselev, Andrey Grabovoy; 18 Sep 2024)
- Sparse Deep Learning for Time Series Data: Theory and Applications (Mingxuan Zhang, Y. Sun, Faming Liang; 05 Oct 2023) [AI4TS, OOD, BDL]
- How to Protect Copyright Data in Optimization of Large Language Models? (T. Chu, Zhao Song, Chiwun Yang; 23 Aug 2023)
- Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification (Lianke Qin, Zhao Song, Yuanyuan Yang; 13 Jul 2023)
- Considering Layerwise Importance in the Lottery Ticket Hypothesis (Benjamin Vandersmissen, José Oramas; 22 Feb 2023)
- A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity (Hongkang Li, Ming Wang, Sijia Liu, Pin-Yu Chen; 12 Feb 2023) [ViT, MLT]
- Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning (François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang; 02 Feb 2023)
- An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models (Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang; 30 Dec 2022)
- Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing (Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo; 25 Nov 2022)
- Characterizing the Spectrum of the NTK via a Power Series Expansion (Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar; 15 Nov 2022)
- When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work (Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo; 21 Oct 2022)
- Global Convergence of SGD On Two Layer Neural Nets (Pulkit Gopalani, Anirbit Mukherjee; 20 Oct 2022)
- On skip connections and normalisation layers in deep optimisation (L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey; 10 Oct 2022) [ODL]
- Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$ (R. Gentile, G. Welper; 17 Sep 2022) [ODL]
- Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity (Jianyi Yang, Shaolei Ren; 02 Jul 2022)
- Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis (Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff; 26 Jun 2022)
- On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms (Lam M. Nguyen, Trang H. Tran; 13 Jun 2022)
- Global Convergence of Over-parameterized Deep Equilibrium Models (Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin; 27 May 2022)
- Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture (Libin Zhu, Chaoyue Liu, M. Belkin; 24 May 2022) [GNN, AI4CE]
- Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks (Bartlomiej Polaczyk, J. Cyranka; 28 Jan 2022) [ODL]
- A Kernel-Expanded Stochastic Neural Network (Y. Sun, F. Liang; 14 Jan 2022)
- Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks (Benjamin Bowman, Guido Montúfar; 12 Jan 2022)
- Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time (Zhao Song, Licheng Zhang, Ruizhe Zhang; 14 Dec 2021)
- SGD Through the Lens of Kolmogorov Complexity (Gregory Schwartzman; 10 Nov 2021)
- Subquadratic Overparameterization for Shallow Neural Networks (Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, V. Cevher; 02 Nov 2021)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks (Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong; 12 Oct 2021) [UQCV, MLT]
- A global convergence theory for deep ReLU implicit networks via over-parameterization (Tianxiang Gao, Hailiang Liu, Jia Liu, Hridesh Rajan, Hongyang Gao; 11 Oct 2021) [MLT]
- Does Preprocessing Help Training Over-parameterized Neural Networks? (Zhao Song, Shuo Yang, Ruizhe Zhang; 09 Oct 2021)
- Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization (Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu; 25 Aug 2021) [MLT, AI4CE]
- What can linearized neural networks actually say about generalization? (Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard; 12 Jun 2021)
- Understanding Overparameterization in Generative Adversarial Networks (Yogesh Balaji, M. Sajedi, Neha Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, S. Feizi; 12 Apr 2021) [AI4CE]
- On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths (Quynh N. Nguyen; 24 Jan 2021)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks (Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin; 12 Jan 2021)
- Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise (Spencer Frei, Yuan Cao, Quanquan Gu; 04 Jan 2021) [FedML, MLT]
- Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training (Theodoros Tsiligkaridis, Jay Roberts; 22 Dec 2020) [AAML]
- Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks (Quynh N. Nguyen, Marco Mondelli, Guido Montúfar; 21 Dec 2020)
- A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks (Zhiqi Bu, Shiyun Xu, Kan Chen; 25 Oct 2020)
- Associative Memory in Iterated Overparameterized Sigmoid Autoencoders (Yibo Jiang, Cengiz Pehlevan; 30 Jun 2020)
- Logarithmic Pruning is All You Need (Laurent Orseau, Marcus Hutter, Omar Rivasplata; 22 Jun 2020)
- Training (Overparametrized) Neural Networks in Near-Linear Time (Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein; 20 Jun 2020) [ODL]
- Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory (Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang; 08 Jun 2020) [OOD, MLT]
- Feature Purification: How Adversarial Training Performs Robust Deep Learning (Zeyuan Allen-Zhu, Yuanzhi Li; 20 May 2020) [MLT, AAML]
- Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm (Sayar Karmakar, Anirbit Mukherjee; 08 May 2020)
- Learning Parities with Neural Networks (Amit Daniely, Eran Malach; 18 Feb 2020)
- Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning (Zixin Wen; 17 Feb 2020) [SSL]
- Memory capacity of neural networks with threshold and ReLU activations (Roman Vershynin; 20 Jan 2020)