1901.08244
Cited By
Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
24 January 2019
Vardan Papyan
Papers citing
"Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians"
26 / 26 papers shown
Non-identifiability distinguishes Neural Networks among Parametric Models
Sourav Chatterjee
Timothy Sudijono
32
0
0
25 Apr 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
58
3
0
31 Jan 2025
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev
Andrey Grabovoy
54
1
0
18 Sep 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
52
0
0
11 Jun 2024
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
73
5
1
25 May 2024
Architectural Strategies for the optimization of Physics-Informed Neural Networks
Hemanth Saratchandran
Shin-Fang Chng
Simon Lucey
AI4CE
39
0
0
05 Feb 2024
Spectral alignment of stochastic gradient descent for high-dimensional classification tasks
Gerard Ben Arous
Reza Gheissari
Jiaoyang Huang
Aukosh Jagannath
35
14
0
04 Oct 2023
FOSI: Hybrid First and Second Order Optimization
Hadar Sivan
Moshe Gabel
Assaf Schuster
ODL
34
2
0
16 Feb 2023
On the Overlooked Structure of Stochastic Gradients
Zeke Xie
Qian-Yuan Tang
Mingming Sun
P. Li
31
6
0
05 Dec 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
19
44
0
26 Jul 2022
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous
Reza Gheissari
Aukosh Jagannath
62
58
0
08 Jun 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
9
0
31 Jan 2022
Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping
Xuran Meng
Jianfeng Yao
27
7
0
26 Nov 2021
Does the Data Induce Capacity Control in Deep Learning?
Rubing Yang
Jialin Mao
Pratik Chaudhari
35
15
0
27 Oct 2021
Appearance of Random Matrix Theory in Deep Learning
Nicholas P. Baskerville
Diego Granziol
J. Keating
15
11
0
12 Feb 2021
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar
Yash Khasbage
Rahul Vigneswaran
V. Balasubramanian
25
42
0
07 Dec 2020
Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win
Utku Evci
Yani Andrew Ioannou
Cem Keskin
Yann N. Dauphin
29
87
0
07 Oct 2020
A Framework for Private Matrix Analysis
Jalaj Upadhyay
Sarvagya Upadhyay
26
4
0
06 Sep 2020
Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan
Xuemei Han
D. Donoho
35
549
0
18 Aug 2020
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Yingxue Zhou
Zhiwei Steven Wu
A. Banerjee
24
106
0
07 Jul 2020
Directional Pruning of Deep Neural Networks
Shih-Kang Chao
Zhanyu Wang
Yue Xing
Guang Cheng
ODL
21
33
0
16 Jun 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski
Maciej Szymczak
Stanislav Fort
Devansh Arpit
Jacek Tabor
Kyunghyun Cho
Krzysztof J. Geras
50
154
0
21 Feb 2020
Asymptotics of Wide Networks from Feynman Diagrams
Ethan Dyer
Guy Gur-Ari
29
113
0
25 Sep 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
42
51
0
24 Jul 2019
Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort
Pawel Krzysztof Nowak
Stanislaw Jastrzebski
S. Narayanan
21
94
0
28 Jan 2019
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,890
0
15 Sep 2016