Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv 1810.02054 (v2, latest) · 4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh (MLT, ODL)

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

Showing 50 of 882 citing papers.
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi (SSL, MLT) · 30 Jun 2022

Theoretical Perspectives on Deep Learning Methods in Inverse Problems
Jonathan Scarlett, Reinhard Heckel, M. Rodrigues, Paul Hand, Yonina C. Eldar (AI4CE) · 29 Jun 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff · 26 Jun 2022

Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart (MLT) · 24 Jun 2022

Optical Flow Regularization of Implicit Neural Representations for Video Frame Interpolation
Weihao Zhuang, T. Hascoet, R. Takashima, T. Takiguchi · 22 Jun 2022

Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas, Yamini Bansal, Preetum Nakkiran · 20 Jun 2022

Adversarial Robustness is at Odds with Lazy Training
Yunjuan Wang, Enayat Ullah, Poorya Mianjy, R. Arora (SILM, AAML) · 18 Jun 2022

Large-width asymptotics for ReLU neural networks with α-Stable initializations
Stefano Favaro, S. Fortini, Stefano Peluchetti · 16 Jun 2022

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu Wang (MQ) · 13 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran · 13 Jun 2022

Analysis of Branch Specialization and its Application in Image Decomposition
Jonathan Brokman, Guy Gilboa · 12 Jun 2022

Gradient Boosting Performs Gaussian Process Inference
Aleksei Ustimenko, Artem Beliakov, Liudmila Prokhorenkova (BDL) · 11 Jun 2022

Parameter Convex Neural Networks
Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng · 11 Jun 2022

Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli · 08 Jun 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee · 08 Jun 2022

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
Benjamin Bowman, Guido Montúfar · 06 Jun 2022

Non-convex online learning via algorithmic equivalence
Udaya Ghai, Zhou Lu, Elad Hazan · 30 May 2022

Long-Tailed Learning Requires Feature Learning
T. Laurent, J. V. Brecht, Xavier Bresson (VLM) · 29 May 2022

Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin · 27 May 2022

A Framework for Overparameterized Learning
Dávid Terjék, Diego González-Sánchez (MLT) · 26 May 2022

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Yaoyu Zhang, Zhi-Qin John Xu · 24 May 2022

Quadratic models for understanding catapult dynamics of neural networks
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin · 24 May 2022

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin (GNN, AI4CE) · 24 May 2022

Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization
Simone Bombari, Mohammad Hossein Amani, Marco Mondelli · 20 May 2022

Mean-Field Analysis of Two-Layer Neural Networks: Global Optimality with Linear Convergence Rates
Jingwei Zhang, Xunpeng Huang, Jincheng Yu (MLT) · 19 May 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran, Gal Vardi, Jason D. Lee (MLT) · 18 May 2022

Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey · 18 May 2022

Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey
Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache (DD) · 17 May 2022

Gradient Descent Optimizes Infinite-Depth ReLU Implicit Networks with Linear Widths
Tianxiang Gao, Hongyang Gao (MLT) · 16 May 2022

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
Wuyang Chen, Wei-Ping Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang · 11 May 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang (MLT) · 03 May 2022

Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs
Aaron Courville, Wei Liu, Kewei Tu · 01 May 2022

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying · 24 Apr 2022

Spectrum of inner-product kernel matrices in the polynomial regime and multiple descent phenomenon in kernel ridge regression
Theodor Misiakiewicz · 21 Apr 2022

Theory of Graph Neural Networks: Representation and Learning
Stefanie Jegelka (GNN, AI4CE) · 16 Apr 2022

On Convergence Lemma and Convergence Stability for Piecewise Analytic Functions
Xiaotie Deng, Hanyu Li, Ningyuan Li · 04 Apr 2022

Convergence of gradient descent for deep neural networks
S. Chatterjee (ODL) · 30 Mar 2022

Random matrix analysis of deep neural network weight matrices
M. Thamm, Max Staats, B. Rosenow · 28 Mar 2022

On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks
Hongru Yang, Zhangyang Wang (MLT) · 27 Mar 2022

Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Yu Huang, Junyang Lin, Chang Zhou, Hongxia Yang, Longbo Huang · 23 Mar 2022

On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
Elvis Dohmatob, A. Bietti (AAML) · 22 Mar 2022

On the Generalization Mystery in Deep Learning
S. Chatterjee, Piotr Zielinski (OOD) · 18 Mar 2022

On the Convergence of Certified Robust Training with Interval Bound Propagation
Yihan Wang, Zhouxing Shi, Quanquan Gu, Cho-Jui Hsieh · 16 Mar 2022

Variational inference of fractional Brownian motion with linear computational complexity
Hippolyte Verdier, François Laurent, Alhassan Cassé, Christian L. Vestergaard, Jean-Baptiste Masson · 15 Mar 2022

Deep Regression Ensembles
Antoine Didisheim, Bryan Kelly, Semyon Malamud (UQCV) · 10 Mar 2022

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
Chaoyue Liu, Libin Zhu, M. Belkin · 10 Mar 2022

Covariate-Balancing-Aware Interpretable Deep Learning models for Treatment Effect Estimation
Kan Chen, Qishuo Yin, Q. Long (CML) · 07 Mar 2022

The Spectral Bias of Polynomial Neural Networks
Moulik Choraria, L. Dadi, Grigorios G. Chrysos, Julien Mairal, Volkan Cevher · 27 Feb 2022

Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity
Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, Ian Barnett · 25 Feb 2022

Benefit of Interpolation in Nearest Neighbor Algorithms
Yue Xing, Qifan Song, Guang Cheng · 23 Feb 2022