ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks

arXiv: 1901.08584 · 24 January 2019
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang
MLT

Papers citing "Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks"

50 / 239 papers shown
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen · 24 Jan 2021

Reproducing Activation Function for Deep Learning
Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang · 13 Jan 2021

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, R. L. Jin · 12 Jan 2021

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen, Marco Mondelli, Guido Montúfar · 21 Dec 2020

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · FedML · 17 Dec 2020

Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki, Sekouba Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie · MLT · 18 Nov 2020

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan · 09 Nov 2020
A Bayesian Perspective on Training Speed and Model Selection
Clare Lyle, Lisa Schut, Binxin Ru, Y. Gal, Mark van der Wilk · 27 Oct 2020

A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu, Shiyun Xu, Kan Chen · 25 Oct 2020

Continual Learning in Low-rank Orthogonal Subspaces
Arslan Chaudhry, Naeemullah Khan, P. Dokania, Philip Torr · CLL · 22 Oct 2020

Deep Learning is Singular, and That's Good
Daniel Murfet, Susan Wei, Biwei Huang, Hui Li, Jesse Gell-Redman, T. Quella · UQCV · 22 Oct 2020

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu · 20 Oct 2020

For self-supervised learning, Rationality implies generalization, provably
Yamini Bansal, Gal Kaplun, Boaz Barak · OOD, SSL · 16 Oct 2020

Regularizing Neural Networks via Adversarial Model Perturbation
Yaowei Zheng, Richong Zhang, Yongyi Mao · AAML · 10 Oct 2020

On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu, Libin Zhu, M. Belkin · 02 Oct 2020
Neural Thompson Sampling
Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu · 02 Oct 2020

Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti, Francis R. Bach · 30 Sep 2020

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Jingtong Su, Yihang Chen, Tianle Cai, Tianhao Wu, Ruiqi Gao, Liwei Wang, J. Lee · 22 Sep 2020

Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen, Sheng Xu · 22 Sep 2020

Generalized Leverage Score Sampling for Neural Networks
J. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu · 21 Sep 2020

Predicting Training Time Without Training
L. Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto · 28 Aug 2020

Deep Networks and the Multiple Manifold Problem
Sam Buchanan, D. Gilboa, John N. Wright · 25 Aug 2020

Multiple Descent: Design Your Own Generalization Curve
Lin Chen, Yifei Min, M. Belkin, Amin Karbasi · DRL · 03 Aug 2020

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu, Zhuoran Yang, Zhaoran Wang · 02 Aug 2020
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training
Andrea Montanari, Yiqiao Zhong · 25 Jul 2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, J. Lee, Nathan Srebro, Daniel Soudry · 13 Jul 2020

Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Zhaoran Wang, Mladen Kolar · 02 Jul 2020

A Revision of Neural Tangent Kernel-based Approaches for Neural Networks
Kyungsu Kim, A. Lozano, Eunho Yang · AAML · 02 Jul 2020

Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent
Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama · CLL · 21 Jun 2020

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
Lingxiao Wang, Zhuoran Yang, Zhaoran Wang · 21 Jun 2020

Training (Overparametrized) Neural Networks in Near-Linear Time
Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein · ODL · 20 Jun 2020

An analytic theory of shallow networks dynamics for hinge loss classification
Franco Pellegrini, Giulio Biroli · 19 Jun 2020

Exploring Weight Importance and Hessian Bias in Model Pruning
Mingchen Li, Yahya Sattar, Christos Thrampoulidis, Samet Oymak · 19 Jun 2020

Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
Matthew Tancik, Pratul P. Srinivasan, B. Mildenhall, Sara Fridovich-Keil, N. Raghavan, Utkarsh Singhal, R. Ramamoorthi, Jonathan T. Barron, Ren Ng · 18 Jun 2020
When Does Preconditioning Help or Hurt Generalization?
S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu · 18 Jun 2020

Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen, Colin Wei, J. Lee, Tengyu Ma · 15 Jun 2020

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks
Kenta Oono, Taiji Suzuki · AI4CE · 15 Jun 2020

Global Attention Improves Graph Networks Generalization
Omri Puny, Heli Ben-Hamu, Y. Lipman · 14 Jun 2020

Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 12 Jun 2020

H3DNet: 3D Object Detection Using Hybrid Geometric Primitives
Zaiwei Zhang, Bo Sun, Haitao Yang, Qi-Xing Huang · 3DPC · 10 Jun 2020

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang · OOD, MLT · 08 Jun 2020

Speedy Performance Estimation for Neural Architecture Search
Binxin Ru, Clare Lyle, Lisa Schut, M. Fil, Mark van der Wilk, Y. Gal · 08 Jun 2020

Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · MLT, AAML · 20 May 2020

Learning the gravitational force law and other analytic functions
Atish Agarwala, Abhimanyu Das, Rina Panigrahy, Qiuyi Zhang · MLT · 15 May 2020
Compressive sensing with un-trained neural networks: Gradient descent finds the smoothest approximation
Reinhard Heckel, Mahdi Soltanolkotabi · 07 May 2020

Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond
Fanghui Liu, Xiaolin Huang, Yudong Chen, Johan A. K. Suykens · BDL · 23 Apr 2020

Analysis of Knowledge Transfer in Kernel Regime
Arman Rahbar, Ashkan Panahi, Chiranjib Bhattacharyya, Devdatt Dubhashi, M. Chehreghani · 30 Mar 2020

Frequency Bias in Neural Networks for Input of Non-Uniform Density
Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, S. Kritchman · 10 Mar 2020

Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Arnulf Jentzen, Timo Welti · 03 Mar 2020

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu, Libin Zhu, M. Belkin · ODL · 29 Feb 2020