1811.04918
Cited By
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
12 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
Papers citing
"Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"
50 / 498 papers shown
Title
Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization
Yinbin Han
Meisam Razaviyayn
Renyuan Xu
DiffM
51
12
0
28 Jan 2024
Improving conversion rate prediction via self-supervised pre-training in online advertising
Alex Shtoff
Yohay Kaplan
Ariel Raviv
16
0
0
25 Jan 2024
RedEx: Beyond Fixed Representation Methods via Convex Optimization
Amit Daniely
Mariano Schain
Gilad Yehudai
27
0
0
15 Jan 2024
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Namjoon Suh
Guang Cheng
MedIm
30
13
0
14 Jan 2024
Deep Learning With DAGs
Sourabh Vivek Balgi
Adel Daoud
Jose M. Pena
G. Wodtke
Jesse Zhou
AI4CE
CML
35
1
0
12 Jan 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur
Yaron Oz
19
0
0
08 Jan 2024
Unraveling the Key Components of OOD Generalization via Diversification
Harold Benoit
Liangze Jiang
Andrei Atanov
Oğuzhan Fatih Kar
Mattia Rigotti
Amir Zamir
CML
31
2
0
26 Dec 2023
Resource-Limited Automated Ki67 Index Estimation in Breast Cancer
J. Gliozzo
Giosuè Cataldo Marinò
A. Bonometti
Marco Frasca
Dario Malchiodi
21
0
0
22 Dec 2023
A note on regularised NTK dynamics with an application to PAC-Bayesian training
Eugenio Clerico
Benjamin Guedj
35
0
0
20 Dec 2023
Improving the Expressive Power of Deep Neural Networks through Integral Activation Transform
Zezhong Zhang
Feng Bao
Guannan Zhang
22
0
0
19 Dec 2023
FedEmb: A Vertical and Hybrid Federated Learning Algorithm using Network And Feature Embedding Aggregation
Fanfei Meng
Lele Zhang
Yu Chen
Yuxin Wang
FedML
19
4
0
30 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
47
1
0
29 Nov 2023
Learning in Deep Factor Graphs with Gaussian Belief Propagation
Seth Nabarro
Mark van der Wilk
Andrew J Davison
BDL
28
0
0
24 Nov 2023
Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks
Jinyung Hong
Theodore P. Pavlic
12
0
0
17 Nov 2023
Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics
Soo Min Kwon
Zekai Zhang
Dogyoon Song
Laura Balzano
Qing Qu
53
2
0
08 Nov 2023
Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models
Tianxiang Gao
Xiaokai Huo
Hailiang Liu
Hongyang Gao
BDL
25
8
0
16 Oct 2023
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong
Lijun Ding
Simon S. Du
48
11
0
03 Oct 2023
The Gift of Feedback: Improving ASR Model Quality by Learning from User Corrections through Federated Learning
Lillian Zhou
Yuxin Ding
Mingqing Chen
Harry Zhang
Rohit Prabhavalkar
Dhruv Guliani
Giovanni Motta
Rajiv Mathews
6
1
0
29 Sep 2023
Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
Pulkit Gopalani
Samyak Jha
Anirbit Mukherjee
19
2
0
17 Sep 2023
Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss
T. Getu
Georges Kaddoum
M. Bennis
40
1
0
13 Sep 2023
Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu
Wenlian Lu
Boyu Chen
27
0
0
25 Jul 2023
A faster and simpler algorithm for learning shallow networks
Sitan Chen
Shyam Narayanan
41
7
0
24 Jul 2023
What can a Single Attention Layer Learn? A Study Through the Random Features Lens
Hengyu Fu
Tianyu Guo
Yu Bai
Song Mei
MLT
35
22
0
21 Jul 2023
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Liam Collins
Hamed Hassani
Mahdi Soltanolkotabi
Aryan Mokhtari
Sanjay Shakkottai
39
10
0
13 Jul 2023
Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities
Guihong Li
Duc-Tuong Hoang
Kartikeya Bhardwaj
Ming Lin
Zhangyang Wang
R. Marculescu
40
10
0
05 Jul 2023
Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space
Zhengdao Chen
44
1
0
03 Jul 2023
Scaling MLPs: A Tale of Inductive Bias
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
40
38
0
23 Jun 2023
Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Pascal Junior Tikeng Notsawo
Hattie Zhou
Mohammad Pezeshki
Irina Rish
G. Dumas
25
23
0
23 Jun 2023
Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Ziyi Huang
H. Lam
Haofeng Zhang
UQCV
26
4
0
09 Jun 2023
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
Mohammed Nowaz Rabbani Chowdhury
Shuai Zhang
Ming Wang
Sijia Liu
Pin-Yu Chen
MoE
29
17
0
07 Jun 2023
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai
Bing Liu
Andrej Risteski
Zico Kolter
Pradeep Ravikumar
SSL
35
9
0
01 Jun 2023
Combinatorial Neural Bandits
Taehyun Hwang
Kyuwook Chai
Min Hwan Oh
18
4
0
31 May 2023
Data Representations' Study of Latent Image Manifolds
Ilya Kaufman
Omri Azencot
14
7
0
31 May 2023
Fine-grained analysis of non-parametric estimation for pairwise learning
Junyu Zhou
Shuo Huang
Han Feng
Puyu Wang
Ding-Xuan Zhou
43
1
0
31 May 2023
Benign Overfitting in Deep Neural Networks under Lazy Training
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Francesco Locatello
V. Cevher
AI4CE
26
10
0
30 May 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang
Yunwen Lei
Di Wang
Yiming Ying
Ding-Xuan Zhou
MLT
29
4
0
26 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
36
71
0
25 May 2023
Tight conditions for when the NTK approximation is valid
Enric Boix-Adserà
Etai Littwin
30
0
0
22 May 2023
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani
Alexandru Damian
Jason D. Lee
MLT
44
13
0
11 May 2023
PGrad: Learning Principal Gradients For Domain Generalization
Zhe Wang
J. E. Grigsby
Yanjun Qi
OOD
29
10
0
02 May 2023
Toward $L_\infty$-recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields
Kefan Dong
Tengyu Ma
43
4
0
29 Apr 2023
Learning Narrow One-Hidden-Layer ReLU Networks
Sitan Chen
Zehao Dou
Surbhi Goel
Adam R. Klivans
Raghu Meka
MLT
24
13
0
20 Apr 2023
Depth Separation with Multilayer Mean-Field Networks
Y. Ren
Mo Zhou
Rong Ge
OOD
22
3
0
03 Apr 2023
On the Stepwise Nature of Self-Supervised Learning
James B. Simon
Maksis Knutins
Liu Ziyin
Daniel Geisz
Abraham J. Fetterman
Joshua Albrecht
SSL
37
30
0
27 Mar 2023
Learning Fractals by Gradient Descent
Cheng-Hao Tu
Hong-You Chen
David Carlyn
Wei-Lun Chao
23
2
0
14 Mar 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
37
16
0
20 Feb 2023
Generalization and Stability of Interpolating Neural Networks with Minimal Width
Hossein Taheri
Christos Thrampoulidis
40
16
0
18 Feb 2023
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
Ziye Ma
Igor Molybog
Javad Lavaei
Somayeh Sojoudi
31
3
0
15 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
Ming Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
37
57
0
12 Feb 2023
Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
Shuai Zhang
Ming Wang
Pin-Yu Chen
Sijia Liu
Songtao Lu
Miaoyuan Liu
MLT
27
16
0
06 Feb 2023