ResearchTrend.AI

arXiv:1901.08244
Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
Vardan Papyan
24 January 2019

Papers citing "Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians"

27 / 27 papers shown
Accelerating Neural Network Training Along Sharp and Flat Directions
Daniyar Zakarin, Sidak Pal Singh (17 May 2025)

Non-identifiability distinguishes Neural Networks among Parametric Models
Sourav Chatterjee, Timothy Sudijono (25 Apr 2025)

Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel, Runa Eschenhagen, Weronika Ormaniec, Andres Fernandez, Lukas Tatzel, Agustinus Kristiadi (31 Jan 2025)

Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev, Andrey Grabovoy (18 Sep 2024)

Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee, Qiaobo Li, Yingxue Zhou (11 Jun 2024)

Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun (25 May 2024)

Architectural Strategies for the Optimization of Physics-Informed Neural Networks
Hemanth Saratchandran, Shin-Fang Chng, Simon Lucey (05 Feb 2024)

Spectral alignment of stochastic gradient descent for high-dimensional classification tasks
Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath (04 Oct 2023)

FOSI: Hybrid First and Second Order Optimization
Hadar Sivan, Moshe Gabel, Assaf Schuster (16 Feb 2023)

On the Overlooked Structure of Stochastic Gradients
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li (05 Dec 2022)

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li (26 Jul 2022)

High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath (08 Jun 2022)

On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie, Qian-Yuan Tang, Yunfeng Cai, Mingming Sun, P. Li (31 Jan 2022)

Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping
Xuran Meng, Jianfeng Yao (26 Nov 2021)

Does the Data Induce Capacity Control in Deep Learning?
Rubing Yang, Jialin Mao, Pratik Chaudhari (27 Oct 2021)

Appearance of Random Matrix Theory in Deep Learning
Nicholas P. Baskerville, Diego Granziol, J. Keating (12 Feb 2021)

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian (07 Dec 2020)

Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win
Utku Evci, Yani Andrew Ioannou, Cem Keskin, Yann N. Dauphin (07 Oct 2020)

A Framework for Private Matrix Analysis
Jalaj Upadhyay, Sarvagya Upadhyay (06 Sep 2020)

Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan, Xuemei Han, D. Donoho (18 Aug 2020)

Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Yingxue Zhou, Zhiwei Steven Wu, A. Banerjee (07 Jul 2020)

Directional Pruning of Deep Neural Networks
Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng (16 Jun 2020)

The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras (21 Feb 2020)

Asymptotics of Wide Networks from Feynman Diagrams
Ethan Dyer, Guy Gur-Ari (25 Sep 2019)

Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee (24 Jul 2019)

Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan (28 Jan 2019)

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)
