ResearchTrend.AI

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv:1810.02054, v2 (latest). 4 October 2018.
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

Showing 50 of 882 citing papers.
  1. Finite Sample Identification of Wide Shallow Neural Networks with Biases. M. Fornasier, T. Klock, Marco Mondelli, Michael Rauchensteiner (08 Nov 2022)
  2. A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna (28 Oct 2022)
  3. LOFT: Finding Lottery Tickets through Filter-wise Training. Qihan Wang, Chen Dun, Fangshuo Liao, C. Jermaine, Anastasios Kyrillidis (28 Oct 2022)
  4. Sparsity in Continuous-Depth Neural Networks. H. Aliee, Till Richter, Mikhail Solonin, I. Ibarra, Fabian J. Theis, Niki Kilbertus (26 Oct 2022)
  5. Pushing the Efficiency Limit Using Structured Sparse Convolutions. Vinay Kumar Verma, Nikhil Mehta, Shijing Si, Ricardo Henao, Lawrence Carin (23 Oct 2022)
  6. Global Convergence of SGD On Two Layer Neural Nets. Pulkit Gopalani, Anirbit Mukherjee (20 Oct 2022)
  7. Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks. Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo (18 Oct 2022)
  8. Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro, Wei Hu (13 Oct 2022)
  9. Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. Diyuan Wu, Vyacheslav Kungurtsev, Marco Mondelli (13 Oct 2022)
  10. From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent. Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan (13 Oct 2022)
  11. Few-shot Backdoor Attacks via Neural Tangent Kernels. J. Hayase, Sewoong Oh (12 Oct 2022)
  12. Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks. Sijia Wang, Yoojin Choi, Junya Chen, Mostafa El-Khamy, Ricardo Henao (11 Oct 2022)
  13. A Kernel-Based View of Language Model Fine-Tuning. Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora (11 Oct 2022)
  14. What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? Nikolaos Tsilivis, Julia Kempe (11 Oct 2022)
  15. Efficient NTK using Dimensionality Reduction. Nir Ailon, Supratim Shit (10 Oct 2022)
  16. On skip connections and normalisation layers in deep optimisation. L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey (10 Oct 2022)
  17. Dynamical Isometry for Residual Networks. Advait Gadhikar, R. Burkholz (05 Oct 2022)
  18. Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees. Siliang Zeng, Mingyi Hong, Alfredo García (04 Oct 2022)
  19. Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks. Xiang Wang, Annie Wang, Mo Zhou, Rong Ge (03 Oct 2022)
  20. A Combinatorial Perspective on the Optimization of Shallow ReLU Networks. Michael Matena, Colin Raffel (01 Oct 2022)
  21. On the optimization and generalization of overparameterized implicit neural networks. Tianxiang Gao, Hongyang Gao (30 Sep 2022)
  22. Neural Networks Efficiently Learn Low-Dimensional Representations with SGD. Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu (29 Sep 2022)
  23. Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks. Yunwen Lei, Rong Jin, Yiming Ying (19 Sep 2022)
  24. Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty. Thomas George, Guillaume Lajoie, A. Baratin (19 Sep 2022)
  25. Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$. R. Gentile, G. Welper (17 Sep 2022)
  26. Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study. Yongtao Wu, Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher (16 Sep 2022)
  27. Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization). Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher (15 Sep 2022)
  28. Generalization Properties of NAS under Activation and Skip Connection Search. Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher (15 Sep 2022)
  29. On the Trade-Off between Actionable Explanations and the Right to be Forgotten. Martin Pawelczyk, Tobias Leemann, Asia J. Biega, Gjergji Kasneci (30 Aug 2022)
  30. Neural Tangent Kernel: A Survey. Eugene Golikov, Eduard Pokonechnyy, Vladimir Korviakov (29 Aug 2022)
  31. Overparameterization from Computational Constraints. Sanjam Garg, S. Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang (27 Aug 2022)
  32. Universal Solutions of Feedforward ReLU Networks for Interpolations. Changcun Huang (16 Aug 2022)
  33. Gaussian Process Surrogate Models for Neural Networks. Michael Y. Li, Erin Grant, Thomas Griffiths (11 Aug 2022)
  34. A Sublinear Adversarial Training Algorithm. Yeqi Gao, Lianke Qin, Zhao Song, Yitan Wang (10 Aug 2022)
  35. Training Overparametrized Neural Networks in Sublinear Time. Yichuan Deng, Han Hu, Zhao Song, Omri Weinstein, Danyang Zhuo (09 Aug 2022)
  36. Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions. Nikolaos Karalias, Joshua Robinson, Andreas Loukas, Stefanie Jegelka (08 Aug 2022)
  37. Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks. Xin Liu, Wei Tao, Wei Li, Dazhi Zhan, Jun Wang, Zhisong Pan (08 Aug 2022)
  38. On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver. Zhongzhan Huang, Senwei Liang, Hong Zhang, Haizhao Yang, Liang Lin (07 Aug 2022)
  39. Federated Adversarial Learning: A Framework with Convergence Analysis. Xiaoxiao Li, Zhao Song, Jiaming Yang (07 Aug 2022)
  40. Towards Understanding Mixture of Experts in Deep Learning. Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li (04 Aug 2022)
  41. Agnostic Learning of General ReLU Activation Using Gradient Descent. Pranjal Awasthi, Alex K. Tang, Aravindan Vijayaraghavan (04 Aug 2022)
  42. Gradient descent provably escapes saddle points in the training of shallow ReLU networks. Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek (03 Aug 2022)
  43. Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability. Z. Li, Zixuan Wang, Jian Li (26 Jul 2022)
  44. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit. Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang (18 Jul 2022)
  45. The Lottery Ticket Hypothesis for Self-attention in Convolutional Neural Network. Zhongzhan Huang, Senwei Liang, Mingfu Liang, Wei He, Haizhao Yang, Liang Lin (16 Jul 2022)
  46. Riemannian Natural Gradient Methods. Jiang Hu, Ruicheng Ao, Anthony Man-Cho So, Minghan Yang, Zaiwen Wen (15 Jul 2022)
  47. Efficient Augmentation for Imbalanced Deep Learning. Damien Dablain, C. Bellinger, Bartosz Krawczyk, Nitesh Chawla (13 Jul 2022)
  48. Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent. Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora (08 Jul 2022)
  49. Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data. Hongkang Li, Shuai Zhang, Ming Wang (07 Jul 2022)
  50. Neural Stein critics with staged $L^2$-regularization. Matthew Repasky, Xiuyuan Cheng, Yao Xie (07 Jul 2022)