Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018

Aarti Singh

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown

| Title | Authors | Tag | Counts | Date |
|---|---|---|---|---|
| Competition analysis on the over-the-counter credit default swap market | L. Abraham | | 51 1 0 | 03 Dec 2020 |
| Neural Contextual Bandits with Deep Representation and Shallow Exploration | Pan Xu Zheng Wen Handong Zhao Quanquan Gu | OffRL | 89 78 0 | 03 Dec 2020 |
| On Generalization of Adaptive Methods for Over-parameterized Linear Regression | Vatsal Shah Soumya Basu Anastasios Kyrillidis Sujay Sanghavi | AI4CE | 59 4 0 | 28 Nov 2020 |
| Neural collapse with unconstrained features | D. Mixon Hans Parshall Jianzong Pi | | 82 121 0 | 23 Nov 2020 |
| Metric Transforms and Low Rank Matrices via Representation Theory of the Real Hyperrectangle | Josh Alman T. Chu Gary Miller Shyam Narayanan Mark Sellke Zhao Song | | 38 1 0 | 23 Nov 2020 |
| Normalization effects on shallow neural networks and related asymptotic expansions | Jiahui Yu K. Spiliopoulos | | 53 6 0 | 20 Nov 2020 |
| Gradient Starvation: A Learning Proclivity in Neural Networks | Mohammad Pezeshki Sekouba Kaba Yoshua Bengio Aaron Courville Doina Precup Guillaume Lajoie | MLT | 160 269 0 | 18 Nov 2020 |
| Towards NNGP-guided Neural Architecture Search | Daniel S. Park Jaehoon Lee Daiyi Peng Yuan Cao Jascha Narain Sohl-Dickstein | BDL | 71 34 0 | 11 Nov 2020 |
| CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee | Tengyu Xu Yingbin Liang Guanghui Lan | | 102 128 0 | 11 Nov 2020 |
| On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces | Zhuoran Yang Chi Jin Zhaoran Wang Mengdi Wang Michael I. Jordan | | 97 18 0 | 09 Nov 2020 |
| Algorithms and Hardness for Linear Algebra on Geometric Graphs | Josh Alman T. Chu Aaron Schild Zhao Song | | 120 30 0 | 04 Nov 2020 |
| Which Minimizer Does My Neural Network Converge To? | Manuel Nonnenmacher David Reeb Ingo Steinwart | ODL | 32 4 0 | 04 Nov 2020 |
| Federated Knowledge Distillation | Hyowoon Seo Jihong Park Seungeun Oh M. Bennis Seong-Lyun Kim | FedML | 101 92 0 | 04 Nov 2020 |
| DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks | Shiyun Xu Zhiqi Bu | | 15 1 0 | 01 Nov 2020 |
| Over-parametrized neural networks as under-determined linear systems | Austin R. Benson Anil Damle Alex Townsend | | 16 0 0 | 29 Oct 2020 |
| Are wider nets better given the same number of parameters? | A. Golubeva Behnam Neyshabur Guy Gur-Ari | | 112 44 0 | 27 Oct 2020 |
| Neural Network Approximation: Three Hidden Layers Are Enough | Zuowei Shen Haizhao Yang Shijun Zhang | | 139 121 0 | 25 Oct 2020 |
| A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks | Zhiqi Bu Shiyun Xu Kan Chen | | 70 18 0 | 25 Oct 2020 |
| On Convergence and Generalization of Dropout Training | Poorya Mianjy R. Arora | | 132 30 0 | 23 Oct 2020 |
| An Investigation of how Label Smoothing Affects Generalization | Blair Chen Liu Ziyin Zihao Wang Paul Pu Liang | UQCV | 92 18 0 | 23 Oct 2020 |
| Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime | Andrea Agazzi Jianfeng Lu | | 89 16 0 | 22 Oct 2020 |
| Deep Learning is Singular, and That's Good | Daniel Murfet Susan Wei Biwei Huang Hui Li Jesse Gell-Redman T. Quella | UQCV | 79 29 0 | 22 Oct 2020 |
| MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery | Xiaoxiao Li Yangsibo Huang Binghui Peng Zhao Song Keqin Li | MIACV | 74 1 0 | 22 Oct 2020 |
| Beyond Lazy Training for Over-parameterized Tensor Decomposition | Xiang Wang Chenwei Wu Jason D. Lee Tengyu Ma Rong Ge | | 91 14 0 | 22 Oct 2020 |
| PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data | S. M. Patil C. Dovrolis | | 75 18 0 | 22 Oct 2020 |
| Towards Understanding the Dynamics of the First-Order Adversaries | Zhun Deng Hangfeng He Jiaoyang Huang Weijie J. Su | AAML | 54 11 0 | 20 Oct 2020 |
| Deep Reinforcement Learning for Adaptive Network Slicing in 5G for Intelligent Vehicular Systems and Smart Cities | A. Nassar Y. Yilmaz | AI4CE | 42 60 0 | 19 Oct 2020 |
| Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data | Mao Ye Dhruv Choudhary Jiecao Yu Ellie Wen Zeliang Chen Jiyan Yang Jongsoo Park Qiang Liu A. Kejariwal | | 76 9 0 | 16 Oct 2020 |
| Temperature check: theory and practice for training models with softmax-cross-entropy losses | Atish Agarwala Jeffrey Pennington Yann N. Dauphin S. Schoenholz | UQCV | 67 34 0 | 14 Oct 2020 |
| How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks | Gao Jin Xinping Yi Liang Zhang Lijun Zhang S. Schewe Xiaowei Huang | | 83 42 0 | 12 Oct 2020 |
| A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix | T. Doan Mehdi Abbana Bennani Bogdan Mazoure Guillaume Rabusseau Pierre Alquier | CLL | 88 86 0 | 07 Oct 2020 |
| Constraining Logits by Bounded Function for Adversarial Robustness | Sekitoshi Kanai Masanori Yamada Shin'ya Yamaguchi Hiroshi Takahashi Yasutoshi Ida | AAML | 33 4 0 | 06 Oct 2020 |
| A Unifying View on Implicit Bias in Training Linear Neural Networks | Chulhee Yun Shankar Krishnan H. Mobahi | MLT | 125 82 0 | 06 Oct 2020 |
| Understanding How Over-Parametrization Leads to Acceleration: A case of learning a single teacher neuron | Jun-Kun Wang Jacob D. Abernethy | | 40 1 0 | 04 Oct 2020 |
| A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network | Jun-Kun Wang Chi-Heng Lin Jacob D. Abernethy | | 76 24 0 | 04 Oct 2020 |
| Computational Separation Between Convolutional and Fully-Connected Networks | Eran Malach Shai Shalev-Shwartz | | 90 26 0 | 03 Oct 2020 |
| WeMix: How to Better Utilize Data Augmentation | Yi Tian Xu Asaf Noy Ming Lin Qi Qian Hao Li Rong Jin | | 84 16 0 | 03 Oct 2020 |
| Interpreting Robust Optimization via Adversarial Influence Functions | Zhun Deng Cynthia Dwork Jialiang Wang Linjun Zhang | TDI | 49 12 0 | 03 Oct 2020 |
| On the linearity of large non-linear models: when and why the tangent kernel is constant | Chaoyue Liu Libin Zhu M. Belkin | | 169 143 0 | 02 Oct 2020 |
| Optimization Landscapes of Wide Deep Neural Networks Are Benign | Johannes Lederer | | 99 8 0 | 02 Oct 2020 |
| Neural Thompson Sampling | Weitong Zhang Dongruo Zhou Lihong Li Quanquan Gu | | 87 122 0 | 02 Oct 2020 |
| Why Adversarial Interaction Creates Non-Homogeneous Patterns: A Pseudo-Reaction-Diffusion Model for Turing Instability | Litu Rout | AAML | 34 1 0 | 01 Oct 2020 |
| Deep Equals Shallow for ReLU Networks in Kernel Regimes | A. Bietti Francis R. Bach | | 113 90 0 | 30 Sep 2020 |
| How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks | Keyulu Xu Mozhi Zhang Jingling Li S. Du Ken-ichi Kawarabayashi Stefanie Jegelka | MLT | 184 313 0 | 24 Sep 2020 |
| Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't | E. Weinan Chao Ma Stephan Wojtowytsch Lei Wu | AI4CE | 125 134 0 | 22 Sep 2020 |
| Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot | Jingtong Su Yihang Chen Tianle Cai Tianhao Wu Ruiqi Gao Liwei Wang Jason D. Lee | | 73 86 0 | 22 Sep 2020 |
| Tensor Programs III: Neural Matrix Laws | Greg Yang | | 79 48 0 | 22 Sep 2020 |
| Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS | Lin Chen Sheng Xu | | 196 94 0 | 22 Sep 2020 |
| Generalized Leverage Score Sampling for Neural Networks | Jason D. Lee Ruoqi Shen Zhao Song Mengdi Wang Zheng Yu | | 71 43 0 | 21 Sep 2020 |
| GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training | Tianle Cai Shengjie Luo Keyulu Xu Di He Tie-Yan Liu Liwei Wang | GNN | 108 167 0 | 07 Sep 2020 |