Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

12 November 2018

Papers citing "Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"

50 / 498 papers shown

Title
MergeBench: A Benchmark for Merging Domain-Specialized LLMs Yifei He Siqi Zeng Yuzheng Hu Rui Yang Tong Zhang Han Zhao MoMe ALM 24 0 0 16 May 2025
The Power of Random Features and the Limits of Distribution-Free Gradient Descent Ari Karchmer Eran Malach 21 0 0 15 May 2025
Block-Biased Mamba for Long-Range Sequence Processing Annan Yu N. Benjamin Erichson Mamba 45 0 0 13 May 2025
Deep Learning Optimization Using Self-Adaptive Weighted Auxiliary Variables Yaru Liu Yiqi Gu Michael K. Ng ODL 67 0 0 30 Apr 2025
Do Two AI Scientists Agree? Xinghong Fu Ziming Liu Max Tegmark 39 0 0 03 Apr 2025
Towards Understanding the Optimization Mechanisms in Deep Learning Binchuan Qi Wei Gong Li Li 52 0 0 29 Mar 2025
Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry Chi-Ning Chou Hang Le Yichen Wang SueYeon Chung 53 0 0 23 Mar 2025
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality Ruijia Zhang Siliang Zeng Chenliang Li Alfredo García Mingyi Hong 67 0 0 22 Mar 2025
High-entropy Advantage in Neural Networks' Generalizability Entao Yang Jiahui Geng Yue Shang Ge Zhang AI4CE 66 0 0 17 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches Yifang Chen Xuyang Guo Xiaoyu Li Yingyu Liang Zhenmei Shi Zhao Song 73 3 0 03 Mar 2025
A Near Complete Nonasymptotic Generalization Theory For Multilayer Neural Networks: Beyond the Bias-Variance Tradeoff Hao Yu Xiangyang Ji AI4CE 60 0 0 03 Mar 2025
Sharper Risk Bound for Multi-Task Learning with Multi-Graph Dependent Data Xiao Shao Guoqiang Wu 60 0 0 25 Feb 2025
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees Gautam Chandrasekaran Adam R. Klivans Lin Lin Lee Konstantinos Stavropoulos OOD 40 0 0 22 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks Hippolyte Labarrière C. Molinari Lorenzo Rosasco S. Villa Cristian Vega 76 0 0 21 Dec 2024
Accelerated zero-order SGD under high-order smoothness and overparameterized regime Georgii Bychkov D. Dvinskikh Anastasia Antsiferova Alexander Gasnikov Aleksandr Lobanov 66 0 0 21 Nov 2024
Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections M. Miani Hrittik Roy Søren Hauberg UQCV BDL 37 0 0 22 Oct 2024
Rethinking generalization of classifiers in separable classes scenarios and over-parameterized regimes Julius Martinetz C. Linse Thomas Martinetz 26 0 0 22 Oct 2024
Which Spaces can be Embedded in $L_p$-type Reproducing Kernel Banach Space? A Characterization via Metric Entropy Yiping Lu Daozhe Lin Qiang Du 39 0 0 14 Oct 2024
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? Khashayar Gatmiry Nikunj Saunshi Sashank J. Reddi Stefanie Jegelka Sanjiv Kumar 69 17 0 10 Oct 2024
Extended convexity and smoothness and their applications in deep learning Binchuan Qi Wei Gong Li Li 63 0 0 08 Oct 2024
On the Hardness of Learning One Hidden Layer Neural Networks Shuchen Li Ilias Zadik Manolis Zampetakis 26 2 0 04 Oct 2024
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision Language Models Nam Hyeon-Woo Moon Ye-Bin Wonseok Choi Lee Hyun Tae-Hyun Oh CoGe 28 3 0 23 Sep 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks Clémentine Dominé Nicolas Anguita A. Proca Lukas Braun D. Kunin P. Mediano Andrew M. Saxe 38 3 0 22 Sep 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes Nikita Kiselev Andrey Grabovoy 54 1 0 18 Sep 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning Mohammadamin Banayeeanzade Mahdi Soltanolkotabi Mohammad Rostami CLL LRM 103 1 0 29 Aug 2024
On the Generalization of Preference Learning with DPO Shawn Im Yixuan Li 52 1 0 06 Aug 2024
Quantum Supervised Learning A. Macaluso 49 3 0 24 Jul 2024
Invertible Neural Warp for NeRF Shin-Fang Chng Ravi Garg Hemanth Saratchandran Simon Lucey 38 2 0 17 Jul 2024
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors Matt Gorbett Hossein Shirazi Indrakshi Ray MQ 48 0 0 16 Jul 2024
First-Order Manifold Data Augmentation for Regression Learning Ilya Kaufman Omri Azencot 33 3 0 16 Jun 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning D. Kunin Allan Raventós Clémentine Dominé Feng Chen David Klindt Andrew M. Saxe Surya Ganguli MLT 48 15 0 10 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation Can Yaras Peng Wang Laura Balzano Qing Qu AI4CE 37 13 0 06 Jun 2024
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding Hongkang Li Meng Wang Tengfei Ma Sijia Liu Zaixi Zhang Pin-Yu Chen MLT AI4CE 50 10 0 04 Jun 2024
Learning Analysis of Kernel Ridgeless Regression with Asymmetric Kernel Learning Fan He Mingzhe He Lei Shi Xiaolin Huang Johan A. K. Suykens 33 1 0 03 Jun 2024
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts Mohammed Nowaz Rabbani Chowdhury Meng Wang Kaoutar El Maghraoui Naigang Wang Pin-Yu Chen Christopher Carothers MoE 39 4 0 26 May 2024
Generalized Laplace Approximation Yinsong Chen Samson S. Yu Zhong Li Chee Peng Lim BDL 53 0 0 22 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks Zhifa Ke Zaiwen Wen Junyu Zhang 37 0 0 07 May 2024
Data Augmentation Policy Search for Long-Term Forecasting Liran Nochumsohn Omri Azencot AI4TS TPM 46 3 0 01 May 2024
Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks Adeyemi Damilare Adeoye Philipp Christian Petersen Alberto Bemporad 30 1 0 23 Apr 2024
On the Empirical Complexity of Reasoning and Planning in LLMs Liwei Kang Zirui Zhao David Hsu Wee Sun Lee LRM 30 5 0 17 Apr 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks Matteo Tucat Anirbit Mukherjee Procheta Sen Mingfei Sun Omar Rivasplata MLT 39 1 0 12 Apr 2024
From Activation to Initialization: Scaling Insights for Optimizing Neural Fields Hemanth Saratchandran Sameera Ramasinghe Simon Lucey AI4CE 41 1 0 28 Mar 2024
FedFisher: Leveraging Fisher Information for One-Shot Federated Learning Divyansh Jhunjhunwala Shiqiang Wang Gauri Joshi FedML 33 6 0 19 Mar 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance Hongkang Li Shuai Zhang Yihua Zhang Meng Wang Sijia Liu Pin-Yu Chen 41 4 0 12 Mar 2024
On the Diminishing Returns of Width for Continual Learning E. Guha V. Lakshman CLL 39 4 0 11 Mar 2024
A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning Shuyu Yin Qixuan Zhou Fei Wen Tao Luo 32 0 0 24 Feb 2024
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning? Hongkang Li Meng Wang Songtao Lu Xiaodong Cui Pin-Yu Chen MLT 46 14 0 23 Feb 2024
SAE: Single Architecture Ensemble Neural Networks Martin Ferianc Hongxiang Fan Miguel R. D. Rodrigues UQCV 24 0 0 09 Feb 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding Zhengqing Wu Berfin Simsek Francois Ged ODL 48 0 0 08 Feb 2024
Sample, estimate, aggregate: A recipe for causal discovery foundation models Menghua Wu Yujia Bao Regina Barzilay Tommi Jaakkola CML 49 7 0 02 Feb 2024