ResearchTrend.AI
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

12 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
    MLT

Papers citing "Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"

50 / 498 papers shown
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He
Siqi Zeng
Yuzheng Hu
Rui Yang
Tong Zhang
Han Zhao
MoMe
ALM
24
0
0
16 May 2025
The Power of Random Features and the Limits of Distribution-Free Gradient Descent
Ari Karchmer
Eran Malach
21
0
0
15 May 2025
Block-Biased Mamba for Long-Range Sequence Processing
Annan Yu
N. Benjamin Erichson
Mamba
45
0
0
13 May 2025
Deep Learning Optimization Using Self-Adaptive Weighted Auxiliary Variables
Yaru Liu
Yiqi Gu
Michael K. Ng
ODL
67
0
0
30 Apr 2025
Do Two AI Scientists Agree?
Xinghong Fu
Ziming Liu
Max Tegmark
39
0
0
03 Apr 2025
Towards Understanding the Optimization Mechanisms in Deep Learning
Binchuan Qi
Wei Gong
Li Li
52
0
0
29 Mar 2025
Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry
Chi-Ning Chou
Hang Le
Yichen Wang
SueYeon Chung
53
0
0
23 Mar 2025
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
67
0
0
22 Mar 2025
High-entropy Advantage in Neural Networks' Generalizability
Entao Yang
Jiahui Geng
Yue Shang
Ge Zhang
AI4CE
66
0
0
17 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen
Xuyang Guo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
73
3
0
03 Mar 2025
A Near Complete Nonasymptotic Generalization Theory For Multilayer Neural Networks: Beyond the Bias-Variance Tradeoff
Hao Yu
Xiangyang Ji
AI4CE
60
0
0
03 Mar 2025
Sharper Risk Bound for Multi-Task Learning with Multi-Graph Dependent Data
Xiao Shao
Guoqiang Wu
60
0
0
25 Feb 2025
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
Gautam Chandrasekaran
Adam R. Klivans
Lin Lin Lee
Konstantinos Stavropoulos
OOD
40
0
0
22 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
76
0
0
21 Dec 2024
Accelerated zero-order SGD under high-order smoothness and overparameterized regime
Georgii Bychkov
D. Dvinskikh
Anastasia Antsiferova
Alexander Gasnikov
Aleksandr Lobanov
66
0
0
21 Nov 2024
Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections
M. Miani
Hrittik Roy
Søren Hauberg
UQCV
BDL
37
0
0
22 Oct 2024
Rethinking generalization of classifiers in separable classes scenarios and over-parameterized regimes
Julius Martinetz
C. Linse
Thomas Martinetz
26
0
0
22 Oct 2024
Which Spaces can be Embedded in $L_p$-type Reproducing Kernel Banach Space? A Characterization via Metric Entropy
Yiping Lu
Daozhe Lin
Qiang Du
39
0
0
14 Oct 2024
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry
Nikunj Saunshi
Sashank J. Reddi
Stefanie Jegelka
Sanjiv Kumar
69
17
0
10 Oct 2024
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
63
0
0
08 Oct 2024
On the Hardness of Learning One Hidden Layer Neural Networks
Shuchen Li
Ilias Zadik
Manolis Zampetakis
26
2
0
04 Oct 2024
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision Language Models
Nam Hyeon-Woo
Moon Ye-Bin
Wonseok Choi
Lee Hyun
Tae-Hyun Oh
CoGe
28
3
0
23 Sep 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
38
3
0
22 Sep 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev
Andrey Grabovoy
54
1
0
18 Sep 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Mohammadamin Banayeeanzade
Mahdi Soltanolkotabi
Mohammad Rostami
CLL
LRM
103
1
0
29 Aug 2024
On the Generalization of Preference Learning with DPO
Shawn Im
Yixuan Li
52
1
0
06 Aug 2024
Quantum Supervised Learning
A. Macaluso
49
3
0
24 Jul 2024
Invertible Neural Warp for NeRF
Shin-Fang Chng
Ravi Garg
Hemanth Saratchandran
Simon Lucey
38
2
0
17 Jul 2024
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
Matt Gorbett
Hossein Shirazi
Indrakshi Ray
MQ
48
0
0
16 Jul 2024
First-Order Manifold Data Augmentation for Regression Learning
Ilya Kaufman
Omri Azencot
33
3
0
16 Jun 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
48
15
0
10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras
Peng Wang
Laura Balzano
Qing Qu
AI4CE
37
13
0
06 Jun 2024
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Hongkang Li
Meng Wang
Tengfei Ma
Sijia Liu
Zaixi Zhang
Pin-Yu Chen
MLT
AI4CE
50
10
0
04 Jun 2024
Learning Analysis of Kernel Ridgeless Regression with Asymmetric Kernel Learning
Fan He
Mingzhe He
Lei Shi
Xiaolin Huang
Johan A. K. Suykens
33
1
0
03 Jun 2024
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury
Meng Wang
Kaoutar El Maghraoui
Naigang Wang
Pin-Yu Chen
Christopher Carothers
MoE
39
4
0
26 May 2024
Generalized Laplace Approximation
Yinsong Chen
Samson S. Yu
Zhong Li
Chee Peng Lim
BDL
53
0
0
22 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
37
0
0
07 May 2024
Data Augmentation Policy Search for Long-Term Forecasting
Liran Nochumsohn
Omri Azencot
AI4TS
TPM
46
3
0
01 May 2024
Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks
Adeyemi Damilare Adeoye
Philipp Christian Petersen
Alberto Bemporad
30
1
0
23 Apr 2024
On the Empirical Complexity of Reasoning and Planning in LLMs
Liwei Kang
Zirui Zhao
David Hsu
Wee Sun Lee
LRM
30
5
0
17 Apr 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
39
1
0
12 Apr 2024
From Activation to Initialization: Scaling Insights for Optimizing Neural Fields
Hemanth Saratchandran
Sameera Ramasinghe
Simon Lucey
AI4CE
41
1
0
28 Mar 2024
FedFisher: Leveraging Fisher Information for One-Shot Federated Learning
Divyansh Jhunjhunwala
Shiqiang Wang
Gauri Joshi
FedML
33
6
0
19 Mar 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
41
4
0
12 Mar 2024
On the Diminishing Returns of Width for Continual Learning
E. Guha
V. Lakshman
CLL
39
4
0
11 Mar 2024
A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning
Shuyu Yin
Qixuan Zhou
Fei Wen
Tao Luo
32
0
0
24 Feb 2024
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Hongkang Li
Meng Wang
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
MLT
46
14
0
23 Feb 2024
SAE: Single Architecture Ensemble Neural Networks
Martin Ferianc
Hongxiang Fan
Miguel R. D. Rodrigues
UQCV
24
0
0
09 Feb 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu
Berfin Simsek
Francois Ged
ODL
48
0
0
08 Feb 2024
Sample, estimate, aggregate: A recipe for causal discovery foundation models
Menghua Wu
Yujia Bao
Regina Barzilay
Tommi Jaakkola
CML
49
7
0
02 Feb 2024