ResearchTrend.AI
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization

24 July 2019
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
    ODL

Papers citing "Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization"

38 / 38 papers shown
Title
Towards Quantifying the Hessian Structure of Neural Networks
Zhaorui Dong
Yushun Zhang
Zhi-Quan Luo
Jianfeng Yao
Ruoyu Sun
31
0
0
05 May 2025
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
37
0
0
11 Nov 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
49
0
0
11 Jun 2024
Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model
Connall Garrod
Jonathan P. Keating
41
8
0
09 Apr 2024
On Differentially Private Subspace Estimation in a Distribution-Free Setting
Eliad Tsfadia
25
1
0
09 Feb 2024
PCDP-SGD: Improving the Convergence of Differentially Private SGD via Projection in Advance
Haichao Sha
Ruixuan Liu
Yi-xiao Liu
Hong Chen
52
1
0
06 Dec 2023
Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization
Elan Rosenfeld
Andrej Risteski
25
10
0
07 Nov 2023
Spectral alignment of stochastic gradient descent for high-dimensional classification tasks
Gerard Ben Arous
Reza Gheissari
Jiaoyang Huang
Aukosh Jagannath
32
14
0
04 Oct 2023
Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via Optimization Trajectory Distillation
Jianan Fan
Dongnan Liu
Hang Chang
Heng-Chiao Huang
Mei Chen
Weidong (Tom) Cai
OOD
27
9
0
27 Jul 2023
Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances
Marcel Kühn
B. Rosenow
19
3
0
08 Jun 2023
Surrogate Model Extension (SME): A Fast and Accurate Weight Update Attack on Federated Learning
Junyi Zhu
Ruicong Yao
Matthew B. Blaschko
FedML
8
9
0
31 May 2023
Choosing Public Datasets for Private Machine Learning via Gradient Subspace Distance
Xin Gu
Gautam Kamath
Zhiwei Steven Wu
28
12
0
02 Mar 2023
Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
R. Venkatesh Babu
31
28
0
28 Dec 2022
On the Overlooked Structure of Stochastic Gradients
Zeke Xie
Qian-Yuan Tang
Mingming Sun
P. Li
31
6
0
05 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang
Yongyi Mao
30
10
0
19 Nov 2022
Noise Injection as a Probe of Deep Learning Dynamics
Noam Levi
I. Bloch
M. Freytsis
T. Volansky
40
2
0
24 Oct 2022
FIT: A Metric for Model Sensitivity
Ben Zandonati
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
24
8
0
16 Oct 2022
Analysis of Branch Specialization and its Application in Image Decomposition
Jonathan Brokman
Guy Gilboa
12
2
0
12 Jun 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
9
0
31 Jan 2022
Improving Differentially Private SGD via Randomly Sparsified Gradients
Junyi Zhu
Matthew B. Blaschko
26
5
0
01 Dec 2021
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Alexandre Ramé
Corentin Dancette
Matthieu Cord
OOD
40
204
0
07 Sep 2021
Shift-Curvature, SGD, and Generalization
Arwen V. Bradley
C. Gomez-Uribe
Manish Reddy Vuyyuru
35
2
0
21 Aug 2021
Large Scale Private Learning via Low-rank Reparametrization
Da Yu
Huishuai Zhang
Wei Chen
Jian Yin
Tie-Yan Liu
23
100
0
17 Jun 2021
Vanishing Curvature and the Power of Adaptive Methods in Randomly Initialized Deep Networks
Antonio Orvieto
Jonas Köhler
Dario Pavllo
Thomas Hofmann
Aurelien Lucchi
ODL
25
5
0
07 Jun 2021
Privately Learning Subspaces
Vikrant Singhal
Thomas Steinke
21
20
0
28 May 2021
Empirically explaining SGD from a line search perspective
Max Mutschler
A. Zell
ODL
LRM
18
4
0
31 Mar 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
Zhenyu Liao
Michael W. Mahoney
22
29
0
02 Mar 2021
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
Jeremy M. Cohen
Simran Kaur
Yuanzhi Li
J. Zico Kolter
Ameet Talwalkar
ODL
34
247
0
26 Feb 2021
Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning
Da Yu
Huishuai Zhang
Wei Chen
Tie-Yan Liu
FedML
SILM
94
110
0
25 Feb 2021
Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak
33
12
0
22 Feb 2021
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
Yikai Wu
Xingyu Zhu
Chenwei Wu
Annie Wang
Rong Ge
24
42
0
08 Oct 2020
A Framework for Private Matrix Analysis
Jalaj Upadhyay
Sarvagya Upadhyay
24
4
0
06 Sep 2020
Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra
Vardan Papyan
14
76
0
27 Aug 2020
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Yingxue Zhou
Zhiwei Steven Wu
A. Banerjee
24
106
0
07 Jul 2020
De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and Non-smooth Predictors
A. Banerjee
Tiancong Chen
Yingxue Zhou
BDL
16
8
0
23 Feb 2020
The Geometry of Sign Gradient Descent
Lukas Balles
Fabian Pedregosa
Nicolas Le Roux
ODL
18
22
0
19 Feb 2020
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Konstantinos Pitas
13
8
0
06 Sep 2019
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,890
0
15 Sep 2016