ResearchTrend.AI

Gradient Descent Finds Global Minima of Deep Neural Networks
arXiv: 1811.03804 (v4, latest)
9 November 2018
S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka
ODL

Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"

50 / 466 papers shown
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan
09 Nov 2020

Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher, David Reeb, Ingo Steinwart
ODL
04 Nov 2020

DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks
Shiyun Xu, Zhiqi Bu
01 Nov 2020

Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli
28 Oct 2020

Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels
Sina Alemohammad, Hossein Babaei, Randall Balestriero, Matt Y. Cheung, Ahmed Imtiaz Humayun, ..., Naiming Liu, Lorenzo Luzi, Jasper Tan, Zichao Wang, Richard G. Baraniuk
27 Oct 2020

A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu, Shiyun Xu, Kan Chen
25 Oct 2020

Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi, Jianfeng Lu
22 Oct 2020

Beyond Lazy Training for Over-parameterized Tensor Decomposition
Xiang Wang, Chenwei Wu, Jason D. Lee, Tengyu Ma, Rong Ge
22 Oct 2020

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
20 Oct 2020

Deep Reinforcement Learning for Adaptive Network Slicing in 5G for Intelligent Vehicular Systems and Smart Cities
A. Nassar, Y. Yilmaz
AI4CE
19 Oct 2020

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Guosheng Lin, E. Weinan
12 Oct 2020

Constraining Logits by Bounded Function for Adversarial Robustness
Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida
AAML
06 Oct 2020

A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun, Shankar Krishnan, H. Mobahi
MLT
06 Oct 2020

WeMix: How to Better Utilize Data Augmentation
Yi Tian Xu, Asaf Noy, Ming Lin, Qi Qian, Hao Li, Rong Jin
03 Oct 2020

On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu, Libin Zhu, M. Belkin
02 Oct 2020

Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti, Francis R. Bach
30 Sep 2020

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu, Mozhi Zhang, Jingling Li, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
MLT
24 Sep 2020

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Jingtong Su, Yihang Chen, Tianle Cai, Tianhao Wu, Ruiqi Gao, Liwei Wang, Jason D. Lee
22 Sep 2020

Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen, Sheng Xu
22 Sep 2020

Kernel-Based Smoothness Analysis of Residual Networks
Tom Tirer, Joan Bruna, Raja Giryes
21 Sep 2020

Generalized Leverage Score Sampling for Neural Networks
Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu
21 Sep 2020

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang
GNN
07 Sep 2020

It's Hard for Neural Networks To Learn the Game of Life
Jacob Mitchell Springer, Garrett Kenyon
03 Sep 2020

Predicting Training Time Without Training
Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
28 Aug 2020

A Dynamical Central Limit Theorem for Shallow Neural Networks
Zhengdao Chen, Grant M. Rotskoff, Joan Bruna, Eric Vanden-Eijnden
21 Aug 2020

Asymptotics of Wide Convolutional Neural Networks
Anders Andreassen, Ethan Dyer
19 Aug 2020

The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
Ben Adlam, Jeffrey Pennington
15 Aug 2020

On the Generalization Properties of Adversarial Training
Yue Xing, Qifan Song, Guang Cheng
AAML
15 Aug 2020

Adversarial Training and Provable Robustness: A Tale of Two Objectives
Jiameng Fan, Wenchao Li
AAML
13 Aug 2020

Multiple Descent: Design Your Own Generalization Curve
Lin Chen, Yifei Min, M. Belkin, Amin Karbasi
DRL
03 Aug 2020

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu, Zhuoran Yang, Zhaoran Wang
02 Aug 2020

Finite Versus Infinite Neural Networks: an Empirical Study
Jaehoon Lee, S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Narain Sohl-Dickstein
31 Jul 2020

On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics
E. Weinan, Stephan Wojtowytsch
MLT
30 Jul 2020

Universality of Gradient Descent Neural Network Training
G. Welper
27 Jul 2020

Early Stopping in Deep Networks: Double Descent and How to Eliminate it
Reinhard Heckel, Fatih Yilmaz
20 Jul 2020

Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions
Sinong Geng, Houssam Nassif, Carlos A. Manzanares, A. M. Reppen, R. Sircar
15 Jul 2020

From Symmetry to Geometry: Tractable Nonconvex Problems
Yuqian Zhang, Qing Qu, John N. Wright
14 Jul 2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry
13 Jul 2020

Maximum-and-Concatenation Networks
Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin
09 Jul 2020

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang
MLT
09 Jul 2020

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)
Yuqing Li, Yaoyu Zhang, N. Yip
07 Jul 2020

DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths
Yanwei Fu, Chen Liu, Donghao Li, Xinwei Sun, Jinshan Zeng, Yuan Yao
04 Jul 2020

Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
Yibo Jiang, Cengiz Pehlevan
30 Jun 2020

Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory
Yaoyu Zhang, Haizhao Yang
28 Jun 2020

Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets
Haoxiang Wang, Ruoyu Sun, Bo Li
MLT, AI4CE
25 Jun 2020

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
Chao Ma, Lei Wu, E. Weinan
MLT
25 Jun 2020

Towards Understanding Hierarchical Learning: Benefits of Neural Representations
Minshuo Chen, Yu Bai, Jason D. Lee, T. Zhao, Huan Wang, Caiming Xiong, R. Socher
SSL
24 Jun 2020

When Do Neural Networks Outperform Kernel Methods?
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
24 Jun 2020

On the Global Optimality of Model-Agnostic Meta-Learning
Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
23 Jun 2020

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Atsushi Nitanda, Taiji Suzuki
22 Jun 2020