1912.07145
PyHessian: Neural Networks Through the Lens of the Hessian
16 December 2019
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
Papers citing "PyHessian: Neural Networks Through the Lens of the Hessian"
18 / 18 papers shown
Title
Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
Dongkwan Lee
Kyomin Hwang
Nojun Kwak
130
0
0
18 Mar 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian Barnett
75
1
0
12 Jan 2025
Meta Curvature-Aware Minimization for Domain Generalization
Zhaoyu Chen
Yiwen Ye
Feilong Tang
Yongsheng Pan
Yong-quan Xia
BDL
352
1
0
16 Dec 2024
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
71
0
0
11 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurelien Lucchi
AI4CE
88
0
0
04 Nov 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
99
7
0
14 Oct 2024
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom
Sangyoon Lee
Jaeho Lee
83
3
0
07 Oct 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
102
0
0
11 Jun 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Chaosheng Dong
Haibo Yang
FedML
112
3
0
24 May 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
137
2
0
30 Apr 2024
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
91
281
0
01 Jun 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
76
16
0
17 Mar 2020
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
77
279
0
10 Nov 2019
Inefficiency of K-FAC for Large Batch Size Training
Linjian Ma
Gabe Montague
Jiayu Ye
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
49
24
0
14 Mar 2019
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
81
236
0
22 Nov 2016
Sub-sampled Newton Methods with Non-uniform Sampling
Peng Xu
Jiyan Yang
Farbod Roosta-Khorasani
Christopher Ré
Michael W. Mahoney
60
115
0
02 Jul 2016
Second-Order Stochastic Optimization for Machine Learning in Linear Time
Naman Agarwal
Brian Bullins
Elad Hazan
ODL
46
102
0
12 Feb 2016
Sub-Sampled Newton Methods I: Globally Convergent Algorithms
Farbod Roosta-Khorasani
Michael W. Mahoney
61
89
0
18 Jan 2016