Optimizing Neural Networks with Kronecker-factored Approximate Curvature
James Martens, Roger C. Grosse
19 March 2015 · arXiv: 1503.05671 · ODL
Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature" (50 of 232 papers shown)
| Title | Authors | Tags | Date |
| --- | --- | --- | --- |
| A linearized framework and a new benchmark for model selection for fine-tuning | Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, L. Zancato, Charless C. Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona | ALM | 29 Jan 2021 |
| LQF: Linear Quadratic Fine-Tuning | Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, M. Polito, Stefano Soatto | | 21 Dec 2020 |
| A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization | Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian | | 07 Dec 2020 |
| A Trace-restricted Kronecker-Factored Approximation to Natural Gradient | Kai-Xin Gao, Xiaolei Liu, Zheng-Hai Huang, Min Wang, Zidong Wang, Dachuan Xu, F. Yu | | 21 Nov 2020 |
| A Random Matrix Theory Approach to Damping in Deep Learning | Diego Granziol, Nicholas P. Baskerville | AI4CE, ODL | 15 Nov 2020 |
| Reverse engineering learned optimizers reveals known and novel mechanisms | Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Narain Sohl-Dickstein | | 04 Nov 2020 |
| Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians | Juhan Bae, Roger C. Grosse | | 26 Oct 2020 |
| Sharpness-Aware Minimization for Efficiently Improving Generalization | Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur | AAML | 03 Oct 2020 |
| A straightforward line search approach on the expected empirical loss for stochastic deep learning problems | Max Mutschler, A. Zell | | 02 Oct 2020 |
| VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning | R. Awasthi, K. K. Guliani, Saif Ahmad Khan, Aniket Vashishtha, M. S. Gill, Arshita Bhatt, A. Nagori, Aniket Gupta, Ponnurangam Kumaraguru, Tavpritesh Sethi | | 14 Sep 2020 |
| Transform Quantization for CNN (Convolutional Neural Network) Compression | Sean I. Young, Wang Zhe, David S. Taubman, B. Girod | MQ | 02 Sep 2020 |
| Optimization of Graph Neural Networks with Natural Gradient Descent | M. Izadi, Yihao Fang, R. Stevenson, Lizhen Lin | GNN | 21 Aug 2020 |
| Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization | Neha S. Wadia, Daniel Duckworth, S. Schoenholz, Ethan Dyer, Jascha Narain Sohl-Dickstein | | 17 Aug 2020 |
| Tighter risk certificates for neural networks | Maria Perez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári | UQCV | 25 Jul 2020 |
| A Differential Game Theoretic Neural Optimizer for Training Residual Networks | Guan-Horng Liu, T. Chen, Evangelos A. Theodorou | | 17 Jul 2020 |
| A General Family of Stochastic Proximal Gradient Methods for Deep Learning | Jihun Yun, A. Lozano, Eunho Yang | | 15 Jul 2020 |
| Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers | Robin M. Schmidt, Frank Schneider, Philipp Hennig | ODL | 03 Jul 2020 |
| Revisiting Loss Modelling for Unstructured Pruning | César Laurent, Camille Ballas, Thomas George, Nicolas Ballas, Pascal Vincent | | 22 Jun 2020 |
| Training (Overparametrized) Neural Networks in Near-Linear Time | Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein | ODL | 20 Jun 2020 |
| Estimating Model Uncertainty of Neural Networks in Sparse Information Form | Jongseo Lee, Matthias Humt, Jianxiang Feng, Rudolph Triebel | BDL, UQCV | 20 Jun 2020 |
| When Does Preconditioning Help or Hurt Generalization? | S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu | | 18 Jun 2020 |
| Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training | Diego Granziol, S. Zohren, Stephen J. Roberts | ODL | 16 Jun 2020 |
| Practical Quasi-Newton Methods for Training Deep Neural Networks | D. Goldfarb, Yi Ren, Achraf Bahamou | ODL | 16 Jun 2020 |
| On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs | Matilde Gargiani, Andrea Zanelli, Moritz Diehl, Frank Hutter | ODL | 03 Jun 2020 |
| ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning | Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney | ODL | 01 Jun 2020 |
| Continual Learning with Extended Kronecker-factored Approximate Curvature | Janghyeon Lee, H. Hong, Donggyu Joo, Junmo Kim | CLL | 16 Apr 2020 |
| RelatIF: Identifying Explanatory Training Examples via Relative Influence | Elnaz Barshan, Marc-Etienne Brunet, Gintare Karolina Dziugaite | TDI | 25 Mar 2020 |
| What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective | Qilong Wang, Li Zhang, Banggu Wu, Dongwei Ren, P. Li, W. Zuo, Q. Hu | | 25 Mar 2020 |
| Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses | Charles G. Frye, James B. Simon, Neha S. Wadia, A. Ligeralde, M. DeWeese, K. Bouchard | ODL | 23 Mar 2020 |
| Fast Predictive Uncertainty for Classification with Bayesian Deep Networks | Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig | BDL, UQCV | 02 Mar 2020 |
| The Two Regimes of Deep Network Training | Guillaume Leclerc, A. Madry | | 24 Feb 2020 |
| Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks | Agustinus Kristiadi, Matthias Hein, Philipp Hennig | BDL, UQCV | 24 Feb 2020 |
| Scalable Second Order Optimization for Deep Learning | Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Y. Singer | ODL | 20 Feb 2020 |
| Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning | Arsenii Ashukha, Alexander Lyzhov, Dmitry Molchanov, Dmitry Vetrov | UQCV, FedML | 15 Feb 2020 |
| On the distance between two neural networks and the stability of learning | Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li | ODL | 09 Feb 2020 |
| Information Newton's flow: second-order optimization method in probability space | Yifei Wang, Wuchen Li | | 13 Jan 2020 |
| Optimization for deep learning: theory and algorithms | Ruoyu Sun | ODL | 19 Dec 2019 |
| Stein Variational Gradient Descent With Matrix-Valued Kernels | Dilin Wang, Ziyang Tang, Chandrajit L. Bajaj, Qiang Liu | | 28 Oct 2019 |
| Kernelized Wasserstein Natural Gradient | Michael Arbel, Arthur Gretton, Wuchen Li, Guido Montúfar | | 21 Oct 2019 |
| Lookahead Optimizer: k steps forward, 1 step back | Michael Ruogu Zhang, James Lucas, Geoffrey E. Hinton, Jimmy Ba | ODL | 19 Jul 2019 |
| Modern Deep Reinforcement Learning Algorithms | Sergey Ivanov, A. Dýakonov | OffRL | 24 Jun 2019 |
| On the Noisy Gradient Descent that Generalizes as SGD | Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu | MLT | 18 Jun 2019 |
| Non-Parametric Calibration for Classification | Jonathan Wenger, Hedvig Kjellström, Rudolph Triebel | UQCV | 12 Jun 2019 |
| The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks | Ryo Karakida, S. Akaho, S. Amari | | 07 Jun 2019 |
| Matrix-Free Preconditioning in Online Learning | Ashok Cutkosky, Tamás Sarlós | ODL | 29 May 2019 |
| Limitations of the Empirical Fisher Approximation for Natural Gradient Descent | Frederik Kunstner, Lukas Balles, Philipp Hennig | | 29 May 2019 |
| Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems | Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Liwei Wang | ODL | 28 May 2019 |
| LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning | Jingjing Zhang, Osvaldo Simeone | | 22 May 2019 |
| Large Batch Optimization for Deep Learning: Training BERT in 76 minutes | Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh | ODL | 01 Apr 2019 |
| An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise | Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba | ODL | 21 Feb 2019 |