1811.04918
Cited By
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
12 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
Papers citing
"Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"
50 / 498 papers shown
Title
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He
Siqi Zeng
Yuzheng Hu
Rui Yang
Tong Zhang
Han Zhao
MoMe
ALM
24
0
0
16 May 2025
The Power of Random Features and the Limits of Distribution-Free Gradient Descent
Ari Karchmer
Eran Malach
21
0
0
15 May 2025
Block-Biased Mamba for Long-Range Sequence Processing
Annan Yu
N. Benjamin Erichson
Mamba
45
0
0
13 May 2025
Deep Learning Optimization Using Self-Adaptive Weighted Auxiliary Variables
Yaru Liu
Yiqi Gu
Michael K. Ng
ODL
67
0
0
30 Apr 2025
Do Two AI Scientists Agree?
Xinghong Fu
Ziming Liu
Max Tegmark
39
0
0
03 Apr 2025
Towards Understanding the Optimization Mechanisms in Deep Learning
Binchuan Qi
Wei Gong
Li Li
52
0
0
29 Mar 2025
Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry
Chi-Ning Chou
Hang Le
Yichen Wang
SueYeon Chung
53
0
0
23 Mar 2025
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
67
0
0
22 Mar 2025
High-entropy Advantage in Neural Networks' Generalizability
Entao Yang
Jiahui Geng
Yue Shang
Ge Zhang
AI4CE
66
0
0
17 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen
Xuyang Guo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
73
3
0
03 Mar 2025
A Near Complete Nonasymptotic Generalization Theory For Multilayer Neural Networks: Beyond the Bias-Variance Tradeoff
Hao Yu
Xiangyang Ji
AI4CE
60
0
0
03 Mar 2025
Sharper Risk Bound for Multi-Task Learning with Multi-Graph Dependent Data
Xiao Shao
Guoqiang Wu
60
0
0
25 Feb 2025
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
Gautam Chandrasekaran
Adam R. Klivans
Lin Lin Lee
Konstantinos Stavropoulos
OOD
40
0
0
22 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
76
0
0
21 Dec 2024
Accelerated zero-order SGD under high-order smoothness and overparameterized regime
Georgii Bychkov
D. Dvinskikh
Anastasia Antsiferova
Alexander Gasnikov
Aleksandr Lobanov
66
0
0
21 Nov 2024
Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections
M. Miani
Hrittik Roy
Søren Hauberg
UQCV
BDL
37
0
0
22 Oct 2024
Rethinking generalization of classifiers in separable classes scenarios and over-parameterized regimes
Julius Martinetz
C. Linse
Thomas Martinetz
26
0
0
22 Oct 2024
Which Spaces can be Embedded in L_p-type Reproducing Kernel Banach Space? A Characterization via Metric Entropy
Yiping Lu
Daozhe Lin
Qiang Du
39
0
0
14 Oct 2024
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry
Nikunj Saunshi
Sashank J. Reddi
Stefanie Jegelka
Sanjiv Kumar
69
17
0
10 Oct 2024
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
63
0
0
08 Oct 2024
On the Hardness of Learning One Hidden Layer Neural Networks
Shuchen Li
Ilias Zadik
Manolis Zampetakis
26
2
0
04 Oct 2024
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision Language Models
Nam Hyeon-Woo
Moon Ye-Bin
Wonseok Choi
Lee Hyun
Tae-Hyun Oh
CoGe
28
3
0
23 Sep 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
38
3
0
22 Sep 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev
Andrey Grabovoy
54
1
0
18 Sep 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Mohammadamin Banayeeanzade
Mahdi Soltanolkotabi
Mohammad Rostami
CLL
LRM
103
1
0
29 Aug 2024
On the Generalization of Preference Learning with DPO
Shawn Im
Yixuan Li
52
1
0
06 Aug 2024
Quantum Supervised Learning
A. Macaluso
49
3
0
24 Jul 2024
Invertible Neural Warp for NeRF
Shin-Fang Chng
Ravi Garg
Hemanth Saratchandran
Simon Lucey
38
2
0
17 Jul 2024
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
Matt Gorbett
Hossein Shirazi
Indrakshi Ray
MQ
48
0
0
16 Jul 2024
First-Order Manifold Data Augmentation for Regression Learning
Ilya Kaufman
Omri Azencot
33
3
0
16 Jun 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
48
15
0
10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
37
13
0
06 Jun 2024
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Hongkang Li
Meng Wang
Tengfei Ma
Sijia Liu
Zaixi Zhang
Pin-Yu Chen
MLT
AI4CE
50
10
0
04 Jun 2024
Learning Analysis of Kernel Ridgeless Regression with Asymmetric Kernel Learning
Fan He
Mingzhe He
Lei Shi
Xiaolin Huang
Johan A. K. Suykens
33
1
0
03 Jun 2024
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury
Meng Wang
Kaoutar El Maghraoui
Naigang Wang
Pin-Yu Chen
Christopher Carothers
MoE
39
4
0
26 May 2024
Generalized Laplace Approximation
Yinsong Chen
Samson S. Yu
Zhong Li
Chee Peng Lim
BDL
53
0
0
22 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
37
0
0
07 May 2024
Data Augmentation Policy Search for Long-Term Forecasting
Liran Nochumsohn
Omri Azencot
AI4TS
TPM
46
3
0
01 May 2024
Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks
Adeyemi Damilare Adeoye
Philipp Christian Petersen
Alberto Bemporad
30
1
0
23 Apr 2024
On the Empirical Complexity of Reasoning and Planning in LLMs
Liwei Kang
Zirui Zhao
David Hsu
Wee Sun Lee
LRM
30
5
0
17 Apr 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
39
1
0
12 Apr 2024
From Activation to Initialization: Scaling Insights for Optimizing Neural Fields
Hemanth Saratchandran
Sameera Ramasinghe
Simon Lucey
AI4CE
41
1
0
28 Mar 2024
FedFisher: Leveraging Fisher Information for One-Shot Federated Learning
Divyansh Jhunjhunwala
Shiqiang Wang
Gauri Joshi
FedML
33
6
0
19 Mar 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
41
4
0
12 Mar 2024
On the Diminishing Returns of Width for Continual Learning
E. Guha
V. Lakshman
CLL
39
4
0
11 Mar 2024
A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning
Shuyu Yin
Qixuan Zhou
Fei Wen
Tao Luo
32
0
0
24 Feb 2024
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Hongkang Li
Meng Wang
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
MLT
46
14
0
23 Feb 2024
SAE: Single Architecture Ensemble Neural Networks
Martin Ferianc
Hongxiang Fan
Miguel R. D. Rodrigues
UQCV
24
0
0
09 Feb 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu
Berfin Simsek
Francois Ged
ODL
48
0
0
08 Feb 2024
Sample, estimate, aggregate: A recipe for causal discovery foundation models
Menghua Wu
Yujia Bao
Regina Barzilay
Tommi Jaakkola
CML
49
7
0
02 Feb 2024