ResearchTrend.AI

arXiv:1901.08244
Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
Vardan Papyan
24 January 2019

Papers citing "Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians"

27 / 27 papers shown
Accelerating Neural Network Training Along Sharp and Flat Directions
Daniyar Zakarin, Sidak Pal Singh (17 May 2025)

Non-identifiability distinguishes Neural Networks among Parametric Models
Sourav Chatterjee, Timothy Sudijono (25 Apr 2025)

Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel, Runa Eschenhagen, Weronika Ormaniec, Andres Fernandez, Lukas Tatzel, Agustinus Kristiadi (31 Jan 2025)

Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev, Andrey Grabovoy (18 Sep 2024)

Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee, Qiaobo Li, Yingxue Zhou (11 Jun 2024)

Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun (25 May 2024)

Architectural Strategies for the Optimization of Physics-Informed Neural Networks
Hemanth Saratchandran, Shin-Fang Chng, Simon Lucey (05 Feb 2024)

Spectral alignment of stochastic gradient descent for high-dimensional classification tasks
Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath (04 Oct 2023)

FOSI: Hybrid First and Second Order Optimization
Hadar Sivan, Moshe Gabel, Assaf Schuster (16 Feb 2023)

On the Overlooked Structure of Stochastic Gradients
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li (05 Dec 2022)

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li (26 Jul 2022)

High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath (08 Jun 2022)

On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie, Qian-Yuan Tang, Yunfeng Cai, Mingming Sun, P. Li (31 Jan 2022)

Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping
Xuran Meng, Jianfeng Yao (26 Nov 2021)

Does the Data Induce Capacity Control in Deep Learning?
Rubing Yang, Jialin Mao, Pratik Chaudhari (27 Oct 2021)

Appearance of Random Matrix Theory in Deep Learning
Nicholas P. Baskerville, Diego Granziol, J. Keating (12 Feb 2021)

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian (07 Dec 2020)

Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win
Utku Evci, Yani Andrew Ioannou, Cem Keskin, Yann N. Dauphin (07 Oct 2020)

A Framework for Private Matrix Analysis
Jalaj Upadhyay, Sarvagya Upadhyay (06 Sep 2020)

Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan, Xuemei Han, D. Donoho (18 Aug 2020)

Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Yingxue Zhou, Zhiwei Steven Wu, A. Banerjee (07 Jul 2020)

Directional Pruning of Deep Neural Networks
Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng (16 Jun 2020)

The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras (21 Feb 2020)

Asymptotics of Wide Networks from Feynman Diagrams
Ethan Dyer, Guy Gur-Ari (25 Sep 2019)

Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee (24 Jul 2019)

Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan (28 Jan 2019)

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)
