Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

12 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
    MLT

Papers citing "Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"

50 / 498 papers shown
Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization
Yinbin Han
Meisam Razaviyayn
Renyuan Xu
DiffM
51
12
0
28 Jan 2024
Improving conversion rate prediction via self-supervised pre-training in online advertising
Alex Shtoff
Yohay Kaplan
Ariel Raviv
16
0
0
25 Jan 2024
RedEx: Beyond Fixed Representation Methods via Convex Optimization
Amit Daniely
Mariano Schain
Gilad Yehudai
27
0
0
15 Jan 2024
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Namjoon Suh
Guang Cheng
MedIm
30
13
0
14 Jan 2024
Deep Learning With DAGs
Sourabh Vivek Balgi
Adel Daoud
Jose M. Pena
G. Wodtke
Jesse Zhou
AI4CE
CML
35
1
0
12 Jan 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur
Yaron Oz
19
0
0
08 Jan 2024
Unraveling the Key Components of OOD Generalization via Diversification
Harold Benoit
Liangze Jiang
Andrei Atanov
Ouguzhan Fatih Kar
Mattia Rigotti
Amir Zamir
CML
31
2
0
26 Dec 2023
Resource-Limited Automated Ki67 Index Estimation in Breast Cancer
J. Gliozzo
Giosuè Cataldo Marinò
A. Bonometti
Marco Frasca
Dario Malchiodi
21
0
0
22 Dec 2023
A note on regularised NTK dynamics with an application to PAC-Bayesian training
Eugenio Clerico
Benjamin Guedj
35
0
0
20 Dec 2023
Improving the Expressive Power of Deep Neural Networks through Integral Activation Transform
Zezhong Zhang
Feng Bao
Guannan Zhang
22
0
0
19 Dec 2023
FedEmb: A Vertical and Hybrid Federated Learning Algorithm using Network And Feature Embedding Aggregation
Fanfei Meng
Lele Zhang
Yu Chen
Yuxin Wang
FedML
19
4
0
30 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
47
1
0
29 Nov 2023
Learning in Deep Factor Graphs with Gaussian Belief Propagation
Seth Nabarro
Mark van der Wilk
Andrew J Davison
BDL
28
0
0
24 Nov 2023
Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks
Jinyung Hong
Theodore P. Pavlic
12
0
0
17 Nov 2023
Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics
Soo Min Kwon
Zekai Zhang
Dogyoon Song
Laura Balzano
Qing Qu
53
2
0
08 Nov 2023
Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models
Tianxiang Gao
Xiaokai Huo
Hailiang Liu
Hongyang Gao
BDL
25
8
0
16 Oct 2023
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong
Lijun Ding
Simon S. Du
48
11
0
03 Oct 2023
The Gift of Feedback: Improving ASR Model Quality by Learning from User Corrections through Federated Learning
Lillian Zhou
Yuxin Ding
Mingqing Chen
Harry Zhang
Rohit Prabhavalkar
Dhruv Guliani
Giovanni Motta
Rajiv Mathews
6
1
0
29 Sep 2023
Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
Pulkit Gopalani
Samyak Jha
Anirbit Mukherjee
19
2
0
17 Sep 2023
Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss
T. Getu
Georges Kaddoum
M. Bennis
40
1
0
13 Sep 2023
Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu
Wenlian Lu
Boyu Chen
27
0
0
25 Jul 2023
A faster and simpler algorithm for learning shallow networks
Sitan Chen
Shyam Narayanan
41
7
0
24 Jul 2023
What can a Single Attention Layer Learn? A Study Through the Random Features Lens
Hengyu Fu
Tianyu Guo
Yu Bai
Song Mei
MLT
35
22
0
21 Jul 2023
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Liam Collins
Hamed Hassani
Mahdi Soltanolkotabi
Aryan Mokhtari
Sanjay Shakkottai
39
10
0
13 Jul 2023
Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities
Guihong Li
Duc-Tuong Hoang
Kartikeya Bhardwaj
Ming Lin
Zhangyang Wang
R. Marculescu
40
10
0
05 Jul 2023
Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space
Zhengdao Chen
44
1
0
03 Jul 2023
Scaling MLPs: A Tale of Inductive Bias
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
40
38
0
23 Jun 2023
Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Pascal Junior Tikeng Notsawo
Hattie Zhou
Mohammad Pezeshki
Irina Rish
G. Dumas
25
23
0
23 Jun 2023
Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Ziyi Huang
H. Lam
Haofeng Zhang
UQCV
26
4
0
09 Jun 2023
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
Mohammed Nowaz Rabbani Chowdhury
Shuai Zhang
Ming Wang
Sijia Liu
Pin-Yu Chen
MoE
29
17
0
07 Jun 2023
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai
Bing Liu
Andrej Risteski
Zico Kolter
Pradeep Ravikumar
SSL
35
9
0
01 Jun 2023
Combinatorial Neural Bandits
Taehyun Hwang
Kyuwook Chai
Min Hwan Oh
18
4
0
31 May 2023
Data Representations' Study of Latent Image Manifolds
Ilya Kaufman
Omri Azencot
14
7
0
31 May 2023
Fine-grained analysis of non-parametric estimation for pairwise learning
Junyu Zhou
Shuo Huang
Han Feng
Puyu Wang
Ding-Xuan Zhou
43
1
0
31 May 2023
Benign Overfitting in Deep Neural Networks under Lazy Training
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Francesco Locatello
V. Cevher
AI4CE
26
10
0
30 May 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang
Yunwen Lei
Di Wang
Yiming Ying
Ding-Xuan Zhou
MLT
29
4
0
26 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
36
71
0
25 May 2023
Tight conditions for when the NTK approximation is valid
Enric Boix-Adserà
Etai Littwin
30
0
0
22 May 2023
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani
Alexandru Damian
Jason D. Lee
MLT
44
13
0
11 May 2023
PGrad: Learning Principal Gradients For Domain Generalization
Zhe Wang
J. E. Grigsby
Yanjun Qi
OOD
29
10
0
02 May 2023
Toward $L_\infty$-recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields
Kefan Dong
Tengyu Ma
43
4
0
29 Apr 2023
Learning Narrow One-Hidden-Layer ReLU Networks
Sitan Chen
Zehao Dou
Surbhi Goel
Adam R. Klivans
Raghu Meka
MLT
24
13
0
20 Apr 2023
Depth Separation with Multilayer Mean-Field Networks
Y. Ren
Mo Zhou
Rong Ge
OOD
22
3
0
03 Apr 2023
On the Stepwise Nature of Self-Supervised Learning
James B. Simon
Maksis Knutins
Liu Ziyin
Daniel Geisz
Abraham J. Fetterman
Joshua Albrecht
SSL
37
30
0
27 Mar 2023
Learning Fractals by Gradient Descent
Cheng-Hao Tu
Hong-You Chen
David Carlyn
Wei-Lun Chao
23
2
0
14 Mar 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
37
16
0
20 Feb 2023
Generalization and Stability of Interpolating Neural Networks with Minimal Width
Hossein Taheri
Christos Thrampoulidis
40
16
0
18 Feb 2023
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
Ziye Ma
Igor Molybog
Javad Lavaei
Somayeh Sojoudi
31
3
0
15 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
Ming Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
37
57
0
12 Feb 2023
Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
Shuai Zhang
Ming Wang
Pin-Yu Chen
Sijia Liu
Songtao Lu
Miaoyuan Liu
MLT
27
16
0
06 Feb 2023