Deep Learning without Poor Local Minima (arXiv:1605.07110) [ODL]
Kenji Kawaguchi
23 May 2016
Papers citing "Deep Learning without Poor Local Minima" (50 of 207 papers shown)
Gradients are Not All You Need
Luke Metz, C. Freeman, S. Schoenholz, Tal Kachman
10 Nov 2021

Mode connectivity in the loss landscape of parameterized quantum circuits
Kathleen E. Hamilton, E. Lynn, R. Pooser
09 Nov 2021

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
Jiayao Zhang, Hua Wang, Weijie J. Su
11 Oct 2021

Towards Demystifying Representation Learning with Non-contrastive Self-supervision [SSL]
Xiang Wang, Xinlei Chen, S. Du, Yuandong Tian
11 Oct 2021

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime [MLT, AI4CE]
Zhiyan Ding, Shi Chen, Qin Li, S. Wright
06 Oct 2021

Exponentially Many Local Minima in Quantum Neural Networks
Xuchen You, Xiaodi Wu
06 Oct 2021

Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng, Liu Cheng, Shin-Jye Lee, Xiaojun Zeng
01 Oct 2021

Constants of Motion: The Antidote to Chaos in Optimization and Game Dynamics
Georgios Piliouras, Xiao Wang
08 Sep 2021

Impact of GPU uncertainty on the training of predictive deep neural networks [BDL]
Maciej Pietrowski, A. Gajda, Takuto Yamamoto, Taisuke Kobayashi, Lana Sinapayen, Eiji Watanabe
03 Sep 2021

The loss landscape of deep linear neural networks: a second-order analysis [ODL]
El Mehdi Achour, François Malgouyres, Sébastien Gerchinovitz
28 Jul 2021

Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression
William T. Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick
19 Jul 2021

Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons
Zuowei Shen, Haizhao Yang, Shijun Zhang
06 Jul 2021

Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity [OOD]
Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
28 Jun 2021

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Yuchen Jin, Dinesh Manocha, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy
22 May 2021

Structured Ensembles: an Approach to Reduce the Memory Footprint of Ensemble Methods [UQCV]
Jary Pomponi, Simone Scardapane, A. Uncini
06 May 2021

A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu
06 May 2021

Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
19 Mar 2021

Optimal Approximation Rate of ReLU Networks in terms of Width and Depth
Zuowei Shen, Haizhao Yang, Shijun Zhang
28 Feb 2021

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao
24 Feb 2021

Understanding self-supervised Learning Dynamics without Contrastive Pairs [SSL]
Yuandong Tian, Xinlei Chen, Surya Ganguli
12 Feb 2021

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin
12 Jan 2021

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning [FedML]
Zeyuan Allen-Zhu, Yuanzhi Li
17 Dec 2020

Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka
08 Dec 2020

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian
07 Dec 2020

Learning Graph Neural Networks with Approximate Gradient Descent [GNN]
Qunwei Li, Shaofeng Zou, Leon Wenliang Zhong
07 Dec 2020

Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER
Markus Holzleitner, Lukas Gruber, Jose A. Arjona-Medina, Johannes Brandstetter, Sepp Hochreiter
02 Dec 2020

Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't [AI4CE]
E. Weinan, Chao Ma, Stephan Wojtowytsch, Lei Wu
22 Sep 2020

It's Hard for Neural Networks To Learn the Game of Life
Jacob Mitchell Springer, Garrett Kenyon
03 Sep 2020

Rethinking CNN Models for Audio Classification [SSL]
Kamalesh Palanisamy, Dipika Singhania, Angela Yao
22 Jul 2020

Sparse Linear Networks with a Fixed Butterfly Structure: Theory and Practice
Nir Ailon, Omer Leibovitch, Vineet Nair
17 Jul 2020

Quantitative Propagation of Chaos for SGD in Wide Neural Networks
Valentin De Bortoli, Alain Durmus, Xavier Fontaine, Umut Simsekli
13 Jul 2020

A Generative Neural Network Framework for Automated Software Testing
Leonid Joffe, David J. Clark
29 Jun 2020

The Depth-to-Width Interplay in Self-Attention
Yoav Levine, Noam Wies, Or Sharir, Hofit Bata, Amnon Shashua
22 Jun 2020

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou
18 Jun 2020

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training [ODL]
Diego Granziol, S. Zohren, Stephen J. Roberts
16 Jun 2020

On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [AAML]
Chen Liu, Mathieu Salzmann, Tao R. Lin, Ryota Tomioka, Sabine Süsstrunk
15 Jun 2020

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias
Lu Yu, Krishnakumar Balasubramanian, S. Volgushev, Murat A. Erdogdu
14 Jun 2020

Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
12 Jun 2020

Feature Purification: How Adversarial Training Performs Robust Deep Learning [MLT, AAML]
Zeyuan Allen-Zhu, Yuanzhi Li
20 May 2020

Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent
Tian Tong, Cong Ma, Yuejie Chi
18 May 2020

Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
Junjie Liu, Zhe Xu, Runbin Shi, R. Cheung, Hayden Kwok-Hay So
14 May 2020

Orthogonal Over-Parameterized Training
Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller
09 Apr 2020

The Landscape of Matrix Factorization Revisited
Hossein Valavi, Sulin Liu, Peter J. Ramadge
27 Feb 2020

BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning [OOD, FedML, UQCV]
Yeming Wen, Dustin Tran, Jimmy Ba
17 Feb 2020

FEA-Net: A Physics-guided Data-driven Model for Efficient Mechanical Response Prediction [AI4CE]
Houpu Yao, Yi Gao, Yongming Liu
31 Jan 2020

Thresholds of descending algorithms in inference problems [AI4CE]
Stefano Sarao Mannelli, Lenka Zdeborova
02 Jan 2020

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
Shiyu Liang, Ruoyu Sun, R. Srikant
31 Dec 2019

The Usual Suspects? Reassessing Blame for VAE Posterior Collapse [DRL]
Bin Dai, Ziyu Wang, David Wipf
23 Dec 2019

Optimization for deep learning: theory and algorithms [ODL]
Ruoyu Sun
19 Dec 2019

Information-Theoretic Local Minima Characterization and Regularization
Zhiwei Jia, Hao Su
19 Nov 2019