Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II

21 July 2021
Yossi Arjevani, M. Field

Papers citing "Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II"

41 papers shown
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek, Amire Bendjeddou, Daniel Hsu
13 Nov 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged
08 Feb 2024
Symmetry & Critical Points for Symmetric Tensor Decomposition Problems
Yossi Arjevani, Gal Vinograd
13 Jun 2023
Equivariant bifurcation, quadratic equivariants, and symmetry breaking for the standard representation of $S_n$
Yossi Arjevani, M. Field
06 Jul 2021
On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
Shunta Akiyama, Taiji Suzuki
11 Jun 2021
Symmetry Breaking in Symmetric Tensor Decomposition
Yossi Arjevani, Joan Bruna, M. Field, Joe Kileel, Matthew Trager, Francis Williams
10 Mar 2021
Nonparametric Learning of Two-Layer ReLU Residual Units
Zhunxuan Wang, Linyun He, Chunchuan Lyu, Shay B. Cohen
17 Aug 2020
Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry
Yossi Arjevani, M. Field
04 Aug 2020
Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová
27 Jun 2020
When Do Neural Networks Outperform Kernel Methods?
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
24 Jun 2020
Symmetry & critical points for a model shallow neural network
Yossi Arjevani, M. Field
23 Mar 2020
On the Principle of Least Symmetry Breaking in Shallow ReLU Models
Yossi Arjevani, M. Field
26 Dec 2019
Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation
Elisa Oostwal, Michiel Straat, Michael Biehl
16 Oct 2019
Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová
18 Jun 2019
On the Power and Limitations of Random Features for Understanding Neural Networks
Gilad Yehudai, Ohad Shamir
01 Apr 2019
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani, Shankar Krishnan, Ying Xiao
29 Jan 2019
Generalisation dynamics of online learning in over-parameterised neural networks
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová
25 Jan 2019
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari, Daniel A. Roberts, Ethan Dyer
12 Dec 2018
The committee machine: Computational to statistical gaps in learning a two-layers neural network
Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, N. Macris, Lenka Zdeborová
14 Jun 2018
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida, S. Akaho, S. Amari
04 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat, Francis R. Bach
24 May 2018
Measuring the Intrinsic Dimension of Objective Landscapes
Chunyuan Li, Heerad Farkhoor, Rosanne Liu, J. Yosinski
24 Apr 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei, Andrea Montanari, Phan-Minh Nguyen
18 Apr 2018
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Z. Yao, A. Gholami, Qi Lei, Kurt Keutzer, Michael W. Mahoney
22 Feb 2018
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir
24 Dec 2017
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
S. Du, Jason D. Lee, Yuandong Tian, Barnabás Póczós, Aarti Singh
03 Dec 2017
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey
13 Nov 2017
Learning One-hidden-layer Neural Networks with Landscape Design
Rong Ge, Jason D. Lee, Tengyu Ma
01 Nov 2017
Porcupine Neural Networks: (Almost) All Local Optima are Global
Soheil Feizi, Hamid Javadi, Jesse M. Zhang, David Tse
05 Oct 2017
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
Lei Wu, Zhanxing Zhu, E. Weinan
30 Jun 2017
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou
14 Jun 2017
Convergence Analysis of Two-layer Neural Networks with ReLU Activation
Yuanzhi Li, Yang Yuan
28 May 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio
15 Mar 2017
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
Yuandong Tian
02 Mar 2017
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Alon Brutzkus, Amir Globerson
26 Feb 2017
A Random Matrix Approach to Neural Networks
Cosme Louart, Zhenyu Liao, Romain Couillet
17 Feb 2017
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun, Léon Bottou, Yann LeCun
22 Nov 2016
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina
06 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016
Distribution-Specific Hardness of Learning Neural Networks
Ohad Shamir
05 Sep 2016
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely, Roy Frostig, Y. Singer
18 Feb 2016