1901.10159
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
29 January 2019
Behrooz Ghorbani
Shankar Krishnan
Ying Xiao
ODL
Papers citing
"An Investigation into Neural Net Optimization via Hessian Eigenvalue Density"
40 / 40 papers shown
KerZOO: Kernel Function Informed Zeroth-Order Optimization for Accurate and Accelerated LLM Fine-Tuning
Zhendong Mi
Qitao Tan
Xiaodong Yu
Zining Zhu
Geng Yuan
Shaoyi Huang
200
0
0
24 May 2025
Accelerating Neural Network Training Along Sharp and Flat Directions
Daniyar Zakarin
Sidak Pal Singh
ODL
65
0
0
17 May 2025
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
Youngjun Song
Youngsik Hwang
Jonghun Lee
Heechang Lee
Dong-Young Lim
AAML
108
0
0
30 Mar 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian Barnett
99
1
0
12 Jan 2025
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
90
0
0
11 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurelien Lucchi
AI4CE
116
0
0
04 Nov 2024
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Lukas Tatzel
Bálint Mucsányi
Osane Hackel
Philipp Hennig
86
0
0
18 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
113
7
0
14 Oct 2024
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom
Sangyoon Lee
Jaeho Lee
96
3
0
07 Oct 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
125
0
0
11 Jun 2024
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
100
6
1
25 May 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Chaosheng Dong
Haibo Yang
FedML
118
3
0
24 May 2024
High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala
Jeffrey Pennington
91
4
0
30 Apr 2024
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
118
13
0
23 Feb 2024
Generalisation under gradient descent via deterministic PAC-Bayes
Eugenio Clerico
Tyler Farghly
George Deligiannidis
Benjamin Guedj
Arnaud Doucet
92
4
0
06 Sep 2022
MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning
Siladittya Manna
Umapada Pal
Saumik Bhattacharya
SSL
102
1
0
24 Nov 2021
Beyond Random Matrix Theory for Deep Networks
Diego Granziol
105
16
0
13 Jun 2020
Concentration of quadratic forms under a Bernstein moment assumption
Pierre C. Bellec
90
21
0
25 Jan 2019
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
100
233
0
12 Dec 2018
The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size
Vardan Papyan
59
31
0
16 Nov 2018
The Goldilocks zone: Towards better understanding of neural network loss landscapes
Stanislav Fort
Adam Scherlis
66
50
0
06 Jul 2018
How Does Batch Normalization Help Optimization?
Shibani Santurkar
Dimitris Tsipras
Andrew Ilyas
Aleksander Madry
ODL
103
1,544
0
29 May 2018
Measuring the Intrinsic Dimension of Objective Landscapes
Chunyuan Li
Heerad Farkhoor
Rosanne Liu
J. Yosinski
86
414
0
24 Apr 2018
Essentially No Barriers in Neural Network Energy Landscape
Felix Dräxler
K. Veschgini
M. Salmhofer
Fred Hamprecht
MoMe
116
434
0
02 Mar 2018
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Z. Yao
A. Gholami
Qi Lei
Kurt Keutzer
Michael W. Mahoney
69
167
0
22 Feb 2018
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg
Yuanzhi Li
Yang Yuan
MLT
77
317
0
17 Feb 2018
Estimating the Spectral Density of Large Implicit Matrices
Ryan P. Adams
Jeffrey Pennington
Matthew J. Johnson
Jamie Smith
Yaniv Ovadia
Brian Patton
J. Saunderson
74
34
0
09 Feb 2018
Visualizing the Loss Landscape of Neural Nets
Hao Li
Zheng Xu
Gavin Taylor
Christoph Studer
Tom Goldstein
256
1,896
0
28 Dec 2017
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski
Zachary Kenton
Devansh Arpit
Nicolas Ballas
Asja Fischer
Yoshua Bengio
Amos Storkey
76
463
0
13 Nov 2017
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
Lei Wu
Zhanxing Zhu
Weinan E
ODL
64
221
0
30 Jun 2017
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
Levent Sagun
Utku Evci
V. U. Güney
Yann N. Dauphin
Léon Bottou
54
418
0
14 Jun 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
125
774
0
15 Mar 2017
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
91
236
0
22 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
427
2,945
0
15 Sep 2016
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhifeng Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zheng
GNN
AI4CE
433
18,361
0
27 May 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,322
0
10 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
886
27,412
0
02 Dec 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,328
0
11 Feb 2015
Qualitatively characterizing neural network optimization problems
Ian Goodfellow
Oriol Vinyals
Andrew M. Saxe
ODL
110
523
0
19 Dec 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.7K
100,479
0
04 Sep 2014