An Improved Analysis of Training Over-parameterized Deep Neural Networks

11 June 2019

Quanquan Gu

Papers citing "An Improved Analysis of Training Over-parameterized Deep Neural Networks"

Training NTK to Generalize with KARE Johannes Schwab Bryan Kelly Semyon Malamud Teng Andrea Xu 11 0 0 16 May 2025
High-entropy Advantage in Neural Networks' Generalizability Entao Yang Jiahui Geng Yue Shang Ge Zhang AI4CE 66 0 0 17 Mar 2025
Feature Learning Beyond the Edge of Stability Dávid Terjék MLT 48 0 0 18 Feb 2025
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits H. Bui Enrique Mallada Anqi Liu 192 0 0 08 Nov 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes Nikita Kiselev Andrey Grabovoy 54 1 0 18 Sep 2024
Sparse Deep Learning for Time Series Data: Theory and Applications Mingxuan Zhang Y. Sun Faming Liang AI4TS OOD BDL 41 2 0 05 Oct 2023
How to Protect Copyright Data in Optimization of Large Language Models? T. Chu Zhao Song Chiwun Yang 45 29 0 23 Aug 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification Lianke Qin Zhao Song Yuanyuan Yang 30 9 0 13 Jul 2023
Considering Layerwise Importance in the Lottery Ticket Hypothesis Benjamin Vandersmissen José Oramas 37 1 0 22 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity Hongkang Li Ming Wang Sijia Liu Pin-Yu Chen ViT MLT 37 57 0 12 Feb 2023
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning François Caron Fadhel Ayed Paul Jung Hoileong Lee Juho Lee Hongseok Yang 67 2 0 02 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models Yufeng Zhang Boyi Liu Qi Cai Lingxiao Wang Zhaoran Wang 53 11 0 30 Dec 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing Josh Alman Jiehao Liang Zhao Song Ruizhe Zhang Danyang Zhuo 84 31 0 25 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion Michael Murray Hui Jin Benjamin Bowman Guido Montúfar 40 11 0 15 Nov 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work Jiawei Zhang Yushun Zhang Mingyi Hong Ruoyu Sun Zhi-Quan Luo 31 10 0 21 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets Pulkit Gopalani Anirbit Mukherjee 26 5 0 20 Oct 2022
On skip connections and normalisation layers in deep optimisation L. MacDonald Jack Valmadre Hemanth Saratchandran Simon Lucey ODL 34 1 0 10 Oct 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$ R. Gentile G. Welper ODL 56 6 0 17 Sep 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity Jianyi Yang Shaolei Ren 32 3 0 02 Jul 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis Alexander Munteanu Simon Omlor Zhao Song David P. Woodruff 33 15 0 26 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms Lam M. Nguyen Trang H. Tran 32 2 0 13 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models Zenan Ling Xingyu Xie Qiuhao Wang Zongpeng Zhang Zhouchen Lin 34 12 0 27 May 2022
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture Libin Zhu Chaoyue Liu M. Belkin GNN AI4CE 23 4 0 24 May 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks Bartlomiej Polaczyk J. Cyranka ODL 35 3 0 28 Jan 2022
A Kernel-Expanded Stochastic Neural Network Y. Sun F. Liang 25 5 0 14 Jan 2022
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks Benjamin Bowman Guido Montúfar 28 11 0 12 Jan 2022
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time Zhao Song Licheng Zhang Ruizhe Zhang 32 64 0 14 Dec 2021
SGD Through the Lens of Kolmogorov Complexity Gregory Schwartzman 39 1 0 10 Nov 2021
Subquadratic Overparameterization for Shallow Neural Networks Chaehwan Song Ali Ramezani-Kebrya Thomas Pethick Armin Eftekhari V. Cevher 30 31 0 02 Nov 2021
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks Shuai Zhang Meng Wang Sijia Liu Pin-Yu Chen Jinjun Xiong UQCV MLT 31 13 0 12 Oct 2021
A global convergence theory for deep ReLU implicit networks via over-parameterization Tianxiang Gao Hailiang Liu Jia Liu Hridesh Rajan Hongyang Gao MLT 36 16 0 11 Oct 2021
Does Preprocessing Help Training Over-parameterized Neural Networks? Zhao Song Shuo Yang Ruizhe Zhang 38 49 0 09 Oct 2021
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization Difan Zou Yuan Cao Yuanzhi Li Quanquan Gu MLT AI4CE 47 39 0 25 Aug 2021
What can linearized neural networks actually say about generalization? Guillermo Ortiz-Jiménez Seyed-Mohsen Moosavi-Dezfooli P. Frossard 29 44 0 12 Jun 2021
Understanding Overparameterization in Generative Adversarial Networks Yogesh Balaji M. Sajedi Neha Kalibhat Mucong Ding Dominik Stöger Mahdi Soltanolkotabi S. Feizi AI4CE 22 21 0 12 Apr 2021
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths Quynh N. Nguyen 53 48 0 24 Jan 2021
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks Asaf Noy Yi Tian Xu Y. Aflalo Lihi Zelnik-Manor Rong Jin 41 3 0 12 Jan 2021
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise Spencer Frei Yuan Cao Quanquan Gu FedML MLT 70 19 0 04 Jan 2021
Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training Theodoros Tsiligkaridis Jay Roberts AAML 22 11 0 22 Dec 2020
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks Quynh N. Nguyen Marco Mondelli Guido Montúfar 25 81 0 21 Dec 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks Zhiqi Bu Shiyun Xu Kan Chen 33 17 0 25 Oct 2020
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders Yibo Jiang Cengiz Pehlevan 19 13 0 30 Jun 2020
Logarithmic Pruning is All You Need Laurent Orseau Marcus Hutter Omar Rivasplata 28 88 0 22 Jun 2020
Training (Overparametrized) Neural Networks in Near-Linear Time Jan van den Brand Binghui Peng Zhao Song Omri Weinstein ODL 29 82 0 20 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory Yufeng Zhang Qi Cai Zhuoran Yang Yongxin Chen Zhaoran Wang OOD MLT 153 11 0 08 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning Zeyuan Allen-Zhu Yuanzhi Li MLT AAML 39 147 0 20 May 2020
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm Sayar Karmakar Anirbit Mukherjee 14 7 0 08 May 2020
Learning Parities with Neural Networks Amit Daniely Eran Malach 24 76 0 18 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning Zixin Wen SSL 21 2 0 17 Feb 2020
Memory capacity of neural networks with threshold and ReLU activations Roman Vershynin 31 21 0 20 Jan 2020