ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
arXiv:1811.04918 · 12 November 2018
Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang
MLT
Papers citing "Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"

50 / 498 papers shown
  • Training Neural Networks with Fixed Sparse Masks
    Yi-Lin Sung, Varun Nair, Colin Raffel · FedML · 18 Nov 2021
  • Dynamics of Local Elasticity During Training of Neural Nets
    Soham Dan, Anirbit Mukherjee, Avirup Das, Phanideep Gampa · 01 Nov 2021
  • ADDS: Adaptive Differentiable Sampling for Robust Multi-Party Learning
    Maoguo Gong, Yuan Gao, Yue Wu, A. K. Qin · FedML, OOD · 29 Oct 2021
  • Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
    Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora · MLT · 26 Oct 2021
  • Quantifying Epistemic Uncertainty in Deep Learning
    Ziyi Huang, H. Lam, Haofeng Zhang · UQCV, BDL, UD, PER · 23 Oct 2021
  • Conditionally Gaussian PAC-Bayes
    Eugenio Clerico, George Deligiannidis, Arnaud Doucet · 22 Oct 2021
  • Wide Neural Networks Forget Less Catastrophically
    Seyed Iman Mirzadeh, Arslan Chaudhry, Dong Yin, Huiyi Hu, Razvan Pascanu, Dilan Görür, Mehrdad Farajtabar · CLL · 21 Oct 2021
  • On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
    Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang · OffRL · 19 Oct 2021
  • Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
    Tolga Ergen, Mert Pilanci · 18 Oct 2021
  • What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
    Zhiyuan Li, Tianhao Wang, Sanjeev Arora · MLT · 13 Oct 2021
  • Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training
    Lillian Zhou, Dhruv Guliani, Andreas Kabel, Giovanni Motta, F. Beaufays · 08 Oct 2021
  • Efficient and Private Federated Learning with Partially Trainable Networks
    Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen · FedML · 06 Oct 2021
  • Random matrices in service of ML footprint: ternary random features with no performance loss
    Hafiz Tiomoko Ali, Zhenyu Liao, Romain Couillet · 05 Oct 2021
  • On the Provable Generalization of Recurrent Neural Networks
    Lifu Wang, Bo Shen, Bo Hu, Xing Cao · 29 Sep 2021
  • Theory of overparametrization in quantum neural networks
    Martín Larocca, Nathan Ju, Diego García-Martín, Patrick J. Coles, M. Cerezo · 23 Sep 2021
  • Comparing Text Representations: A Theory-Driven Approach
    Gregory Yauney, David M. Mimno · 15 Sep 2021
  • Patch-based Medical Image Segmentation using Matrix Product State Tensor Networks
    Raghavendra Selvan, Erik Dam, Soren Alexander Flensborg, Jens Petersen · MedIm · 15 Sep 2021
  • Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data
    Yi Yang, Daoye Zhu, Tengteng Qu, Qiangyu Wang, Fuhu Ren, Chengqi Cheng · 13 Sep 2021
  • Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
    Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu · MLT, AI4CE · 25 Aug 2021
  • A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?
    Hiroaki Mikami, Kenji Fukumizu, Shogo Murai, Shuji Suzuki, Yuta Kikuchi, Taiji Suzuki, S. Maeda, Kohei Hayashi · 25 Aug 2021
  • Revealing the Distributional Vulnerability of Discriminators by Implicit Generators
    Zhilin Zhao, LongBing Cao, Kun-Yu Lin · 23 Aug 2021
  • Boosting of Head Pose Estimation by Knowledge Distillation
    A. Sheka, V. Samun · 20 Aug 2021
  • Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data
    Yan Li, Caleb Ju, Ethan X. Fang, T. Zhao · 15 Aug 2021
  • A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions
    Arnulf Jentzen, Adrian Riekert · 10 Aug 2021
  • How much pre-training is enough to discover a good subnetwork?
    Cameron R. Wolfe, Fangshuo Liao, Qihan Wang, J. Kim, Anastasios Kyrillidis · 31 Jul 2021
  • Deep Networks Provably Classify Data on Curves
    Tingran Wang, Sam Buchanan, D. Gilboa, John N. Wright · 29 Jul 2021
  • Local SGD Optimizes Overparameterized Neural Networks in Polynomial Time
    Yuyang Deng, Mohammad Mahdi Kamani, M. Mahdavi · FedML · 22 Jul 2021
  • Convergence rates for shallow neural networks learned by gradient descent
    Alina Braun, Michael Kohler, S. Langer, Harro Walk · 20 Jul 2021
  • Inverse Problem of Nonlinear Schrödinger Equation as Learning of Convolutional Neural Network
    Yiran Wang, Zhen Li · 19 Jul 2021
  • SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
    Satyen Kale, Ayush Sekhari, Karthik Sridharan · 11 Jul 2021
  • The Values Encoded in Machine Learning Research
    Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao · 29 Jun 2021
  • Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
    Spencer Frei, Quanquan Gu · 25 Jun 2021
  • On the Cryptographic Hardness of Learning Single Periodic Neurons
    M. Song, Ilias Zadik, Joan Bruna · AAML · 20 Jun 2021
  • Wide stochastic networks: Gaussian limit and PAC-Bayesian training
    Eugenio Clerico, George Deligiannidis, Arnaud Doucet · 17 Jun 2021
  • Understanding Deflation Process in Over-parametrized Tensor Decomposition
    Rong Ge, Y. Ren, Xiang Wang, Mo Zhou · 11 Jun 2021
  • The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
    Geoff Pleiss, John P. Cunningham · 11 Jun 2021
  • Towards Understanding Generalization via Decomposing Excess Risk Dynamics
    Jiaye Teng, Jianhao Ma, Yang Yuan · 11 Jun 2021
  • Early-stopped neural networks are consistent
    Ziwei Ji, Justin D. Li, Matus Telgarsky · 10 Jun 2021
  • The dilemma of quantum neural networks
    Yan Qian, Xinbiao Wang, Yuxuan Du, Xingyao Wu, Dacheng Tao · 09 Jun 2021
  • Submodular + Concave
    Siddharth Mitra, Moran Feldman, Amin Karbasi · 09 Jun 2021
  • The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
    Mufan Li, Mihai Nica, Daniel M. Roy · 07 Jun 2021
  • Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
    Melih Barsbey, Romain Chor, Murat A. Erdogdu, Gaël Richard, Umut Simsekli · 07 Jun 2021
  • An Even More Optimal Stochastic Optimization Algorithm: Minibatching and Interpolation Learning
    Blake E. Woodworth, Nathan Srebro · 04 Jun 2021
  • Learning and Generalization in RNNs
    A. Panigrahi, Navin Goyal · 31 May 2021
  • Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
    Zixin Wen, Yuanzhi Li · SSL, MLT · 31 May 2021
  • Characterization of Generalizability of Spike Timing Dependent Plasticity trained Spiking Neural Networks
    Biswadeep Chakraborty, Saibal Mukhopadhyay · 31 May 2021
  • AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
    Yuchen Jin, Dinesh Manocha, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy · 22 May 2021
  • The Dynamics of Gradient Descent for Overparametrized Neural Networks
    Siddhartha Satpathi, R. Srikant · MLT, AI4CE · 13 May 2021
  • Convergence and Implicit Bias of Gradient Flow on Overparametrized Linear Networks
    Hancheng Min, Salma Tarmoun, René Vidal, Enrique Mallada · MLT · 13 May 2021
  • Principal Components Bias in Over-parameterized Linear Models, and its Manifestation in Deep Neural Networks
    Guy Hacohen, D. Weinshall · 12 May 2021