1808.01204
Cited By
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
3 August 2018
Yuanzhi Li
Yingyu Liang
MLT
Papers citing "Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data"
50 / 151 papers shown
Title
Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime
Francesco Camilli
D. Tieplova
Eleonora Bergamin
Jean Barbier
138
0
0
06 May 2025
On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou
Yongyi Yang
Jie Ren
Mahito Sugiyama
Junchi Yan
53
0
0
20 Mar 2025
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Can Jin
Ying Li
Mingyu Zhao
Shiyu Zhao
Zhenting Wang
Xiaoxiao He
Ligong Han
Tong Che
Dimitris N. Metaxas
VPVLM
VLM
124
1
0
02 Feb 2025
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
63
0
0
08 Oct 2024
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Boqian Wu
Q. Xiao
Shunxin Wang
N. Strisciuglio
Mykola Pechenizkiy
M. V. Keulen
Decebal Constantin Mocanu
Elena Mocanu
OOD
3DH
57
0
0
03 Oct 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
41
4
0
12 Mar 2024
GD doesn't make the cut: Three ways that non-differentiability affects neural network training
Siddharth Krishna Kumar
AAML
18
2
0
16 Jan 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur
Yaron Oz
19
0
0
08 Jan 2024
Lifted RDT based capacity analysis of the 1-hidden layer treelike sign perceptrons neural networks
M. Stojnic
24
1
0
13 Dec 2023
Capacity of the treelike sign perceptrons neural networks with one hidden layer -- RDT based upper bounds
M. Stojnic
21
4
0
13 Dec 2023
How to Protect Copyright Data in Optimization of Large Language Models?
T. Chu
Zhao Song
Chiwun Yang
40
29
0
23 Aug 2023
Federated Semi-Supervised and Semi-Asynchronous Learning for Anomaly Detection in IoT Networks
Wenbin Zhai
Feng Wang
L. Liu
Youwei Ding
Wanyi Lu
32
0
0
23 Aug 2023
Understanding Deep Neural Networks via Linear Separability of Hidden Layers
Chao Zhang
Xinyuan Chen
Wensheng Li
Lixue Liu
Wei Wu
Dacheng Tao
28
3
0
26 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin
Zhao Song
Yuanyuan Yang
25
9
0
13 Jul 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang
Yunwen Lei
Di Wang
Yiming Ying
Ding-Xuan Zhou
MLT
29
4
0
26 May 2023
SketchOGD: Memory-Efficient Continual Learning
Benjamin Wright
Youngjae Min
Jeremy Bernstein
Navid Azizan
CLL
28
0
0
25 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri
Christos Thrampoulidis
MLT
16
3
0
22 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
Yicheng Li
Zixiong Yu
Y. Cotronis
Qian Lin
55
13
0
04 May 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
37
16
0
20 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
Hao Wu
Sijia Liu
Pin-Yu Chen
ViT
MLT
37
57
0
12 Feb 2023
Global Convergence Rate of Deep Equilibrium Models with General Activations
Lan V. Truong
39
2
0
11 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
31
47
0
02 Feb 2023
Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin
Kevin Scaman
Marc Lelarge
27
3
0
19 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
53
11
0
30 Dec 2022
Enhancing Neural Network Differential Equation Solvers
Matthew J. H. Wright
21
0
0
28 Dec 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao Song
Ruizhe Zhang
Danyang Zhuo
77
31
0
25 Nov 2022
Understanding the double descent curve in Machine Learning
Luis Sa-Couto
J. M. Ramos
Miguel Almeida
Andreas Wichert
35
1
0
18 Nov 2022
Cold Start Streaming Learning for Deep Networks
Cameron R. Wolfe
Anastasios Kyrillidis
CLL
20
2
0
09 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare?
Julius Martinetz
T. Martinetz
30
1
0
07 Nov 2022
Why neural networks find simple solutions: the many regularizers of geometric complexity
Benoit Dherin
Michael Munn
M. Rosca
David Barrett
55
31
0
27 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in 1d
R. Gentile
G. Welper
ODL
52
6
0
17 Sep 2022
Differentially Private Stochastic Gradient Descent with Low-Noise
Puyu Wang
Yunwen Lei
Yiming Ying
Ding-Xuan Zhou
FedML
46
5
0
09 Sep 2022
Intersection of Parallels as an Early Stopping Criterion
Ali Vardasbi
Maarten de Rijke
Mostafa Dehghani
MoMe
38
5
0
19 Aug 2022
Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen
Yihe Deng
Yue-bo Wu
Quanquan Gu
Yuan-Fang Li
MLT
MoE
30
53
0
04 Aug 2022
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian
Jason D. Lee
Mahdi Soltanolkotabi
SSL
MLT
22
114
0
30 Jun 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu
Simon Omlor
Zhao Song
David P. Woodruff
30
15
0
26 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
40
70
0
14 Jun 2022
Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang
Ming Yin
Yu-Xiang Wang
MQ
24
4
0
13 Jun 2022
Meet You Halfway: Explaining Deep Learning Mysteries
Oriel BenShmuel
AAML
FedML
FAtt
OOD
27
0
0
09 Jun 2022
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani
Yunzhi Bai
Jason D. Lee
29
10
0
08 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling
Xingyu Xie
Qiuhao Wang
Zongpeng Zhang
Zhouchen Lin
32
12
0
27 May 2022
One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks
Shutong Wu
Sizhe Chen
Cihang Xie
X. Huang
AAML
45
27
0
24 May 2022
Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable
Promit Ghosal
Srinath Mahankali
Yihang Sun
MLT
26
4
0
24 May 2022
On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
36
13
0
22 Apr 2022
Understanding the unstable convergence of gradient descent
Kwangjun Ahn
J.N. Zhang
S. Sra
33
57
0
03 Apr 2022
Convergence of gradient descent for deep neural networks
S. Chatterjee
ODL
21
20
0
30 Mar 2022
Explicitising The Implicit Intrepretability of Deep Neural Networks Via Duality
Chandrashekar Lakshminarayanan
Ashutosh Kumar Singh
A. Rajkumar
AI4CE
26
1
0
01 Mar 2022
On Regularizing Coordinate-MLPs
Sameera Ramasinghe
L. MacDonald
Simon Lucey
158
5
0
01 Feb 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk
J. Cyranka
ODL
33
3
0
28 Jan 2022
How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
Shuai Zhang
Hao Wu
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
SSL
MLT
41
22
0
21 Jan 2022