ResearchTrend.AI
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

12 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
    MLT

Papers citing "Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"

50 / 498 papers shown
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
31
47
0
02 Feb 2023
Gradient Descent in Neural Networks as Sequential Learning in RKBS
A. Shilton
Sunil R. Gupta
Santu Rana
Svetha Venkatesh
MLT
19
1
0
01 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
65
15
0
30 Jan 2023
Norm-based Generalization Bounds for Compositionally Sparse Neural Networks
Tomer Galanti
Mengjia Xu
Liane Galanti
T. Poggio
38
9
0
28 Jan 2023
FedExP: Speeding Up Federated Averaging via Extrapolation
Divyansh Jhunjhunwala
Shiqiang Wang
Gauri Joshi
FedML
21
53
0
23 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
53
11
0
30 Dec 2022
Bayesian Interpolation with Deep Linear Networks
Boris Hanin
Alexander Zlokapa
42
25
0
29 Dec 2022
Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification
Yuxuan Du
Yibo Yang
Dacheng Tao
Min-hsiu Hsieh
43
23
0
29 Dec 2022
Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks
Ilja Kuzborskij
Csaba Szepesvári
21
4
0
28 Dec 2022
COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks
Md. Ismail Hossain
Mohammed Rakib
M. M. L. Elahi
Nabeel Mohammed
Shafin Rahman
21
1
0
24 Dec 2022
Eigenvalue initialisation and regularisation for Koopman autoencoders
Jack W. Miller
Charles O'Neill
N. Constantinou
Omri Azencot
17
2
0
23 Dec 2022
Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs
Chenxiao Yang
Qitian Wu
Jiahua Wang
Junchi Yan
AI4CE
24
51
0
18 Dec 2022
Graph Neural Network based Child Activity Recognition
Sanka Mohottala
Pradeepa Samarasinghe
D. Kasthurirathna
Charith Abhayaratne
BDL
GNN
18
5
0
18 Dec 2022
Leveraging Unlabeled Data to Track Memorization
Mahsa Forouzesh
Hanie Sedghi
Patrick Thiran
NoLa
TDI
34
4
0
08 Dec 2022
Matching DNN Compression and Cooperative Training with Resources and Data Availability
F. Malandrino
G. Giacomo
Armin Karamzade
Marco Levorato
C. Chiasserini
45
9
0
02 Dec 2022
Nonlinear Advantage: Trained Networks Might Not Be As Complex as You Think
Christian H. X. Ali Mehmeti-Göpel
Jan Disselhoff
13
5
0
30 Nov 2022
On the Power of Foundation Models
Yang Yuan
20
36
0
29 Nov 2022
Linear RNNs Provably Learn Linear Dynamic Systems
Lifu Wang
Tianyu Wang
Shengwei Yi
Bo Shen
Bo Hu
Xing Cao
22
0
0
19 Nov 2022
Understanding the double descent curve in Machine Learning
Luis Sa-Couto
J. M. Ramos
Miguel Almeida
Andreas Wichert
35
1
0
18 Nov 2022
On the symmetries in the dynamics of wide two-layer neural networks
Karl Hajjar
Lénaïc Chizat
21
11
0
16 Nov 2022
Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang
A. Engel
Anand D. Sarwate
Ioana Dumitriu
Tony Chiang
40
14
0
11 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare?
Julius Martinetz
T. Martinetz
30
1
0
07 Nov 2022
Sparsity in Continuous-Depth Neural Networks
H. Aliee
Till Richter
Mikhail Solonin
I. Ibarra
Fabian J. Theis
Niki Kilbertus
29
10
0
26 Oct 2022
Pushing the Efficiency Limit Using Structured Sparse Convolutions
Vinay Kumar Verma
Nikhil Mehta
Shijing Si
Ricardo Henao
Lawrence Carin
17
3
0
23 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani
Anirbit Mukherjee
26
5
0
20 Oct 2022
Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks
Louis Schatzki
Martín Larocca
Quynh T. Nguyen
F. Sauvage
M. Cerezo
44
85
0
18 Oct 2022
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
78
61
0
11 Oct 2022
LieGG: Studying Learned Lie Group Generators
A. Moskalev
A. Sepliarskaia
Ivan Sosnovik
A. Smeulders
28
22
0
09 Oct 2022
Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent
Michael Kohler
A. Krzyżak
32
10
0
04 Oct 2022
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma
Li-Zhen Guo
S. Fattahi
38
4
0
01 Oct 2022
On the optimization and generalization of overparameterized implicit neural networks
Tianxiang Gao
Hongyang Gao
MLT
AI4CE
19
3
0
30 Sep 2022
Hierarchical Sliced Wasserstein Distance
Khai Nguyen
Tongzheng Ren
Huy Nguyen
Litu Rout
T. Nguyen
Nhat Ho
33
20
0
27 Sep 2022
Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks
Yunwen Lei
Rong Jin
Yiming Ying
MLT
40
18
0
19 Sep 2022
Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
V. Cevher
AI4CE
28
15
0
15 Sep 2022
Overparameterization from Computational Constraints
Sanjam Garg
S. Jha
Saeed Mahloujifar
Mohammad Mahmoody
Mingyuan Wang
28
1
0
27 Aug 2022
Intersection of Parallels as an Early Stopping Criterion
Ali Vardasbi
Maarten de Rijke
Mostafa Dehghani
MoMe
41
5
0
19 Aug 2022
On the generalization of learning algorithms that do not converge
N. Chandramoorthy
Andreas Loukas
Khashayar Gatmiry
Stefanie Jegelka
MLT
19
11
0
16 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
25
11
0
11 Aug 2022
A Sublinear Adversarial Training Algorithm
Yeqi Gao
Lianke Qin
Zhao Song
Yitan Wang
GAN
36
25
0
10 Aug 2022
Training Overparametrized Neural Networks in Sublinear Time
Yichuan Deng
Han Hu
Zhao Song
Omri Weinstein
Danyang Zhuo
BDL
30
28
0
09 Aug 2022
Federated Adversarial Learning: A Framework with Convergence Analysis
Xiaoxiao Li
Zhao Song
Jiaming Yang
FedML
27
19
0
07 Aug 2022
Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen
Yihe Deng
Yue-bo Wu
Quanquan Gu
Yuan-Fang Li
MLT
MoE
42
53
0
04 Aug 2022
Utilizing Excess Resources in Training Neural Networks
Amit Henig
Raja Giryes
53
0
0
12 Jul 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
42
27
0
08 Jul 2022
Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data
Hongkang Li
Shuai Zhang
Ming Wang
MLT
21
8
0
07 Jul 2022
Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling
Hongkang Li
Ming Wang
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
GNN
16
27
0
07 Jul 2022
Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng
Neha Hulkund
Kyunghyun Cho
Marzyeh Ghassemi
OOD
27
4
0
05 Jul 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang
Shaolei Ren
32
3
0
02 Jul 2022
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian
Jason D. Lee
Mahdi Soltanolkotabi
SSL
MLT
25
114
0
30 Jun 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu
Simon Omlor
Zhao Song
David P. Woodruff
33
15
0
26 Jun 2022