arXiv:1503.05671 (v7, latest)
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015
James Martens, Roger C. Grosse
Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature" (50 of 645 shown):
- A General Family of Stochastic Proximal Gradient Methods for Deep Learning. Jihun Yun, A. Lozano, Eunho Yang (15 Jul 2020)
- A Study of Gradient Variance in Deep Learning. Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba (09 Jul 2020)
- Meta-Learning Symmetries by Reparameterization. Allan Zhou, Tom Knowles, Chelsea Finn (06 Jul 2020)
- Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers. Robin M. Schmidt, Frank Schneider, Philipp Hennig (03 Jul 2020)
- Convolutional Neural Network Training with Distributed K-FAC. J. G. Pauloski, Zhao Zhang, Lei Huang, Weijia Xu, Ian Foster (01 Jul 2020)
- Continual Learning: Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes. Timothée Lesort (01 Jul 2020)
- A Theoretical Framework for Target Propagation. Alexander Meulemans, Francesco S. Carzaniga, Johan A. K. Suykens, João Sacramento, Benjamin Grewe (25 Jun 2020)
- Revisiting Loss Modelling for Unstructured Pruning. César Laurent, Camille Ballas, Thomas George, Nicolas Ballas, Pascal Vincent (22 Jun 2020)
- Training (Overparametrized) Neural Networks in Near-Linear Time. Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein (20 Jun 2020)
- Estimating Model Uncertainty of Neural Networks in Sparse Information Form. Jongseo Lee, Matthias Humt, Jianxiang Feng, Rudolph Triebel (20 Jun 2020)
- Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods. Minghan Yang, Dong Xu, Yongfeng Li, Zaiwen Wen, Mengyun Chen (17 Jun 2020)
- Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training. Diego Granziol, S. Zohren, Stephen J. Roberts (16 Jun 2020)
- Practical Quasi-Newton Methods for Training Deep Neural Networks. Shiqian Ma, Yi Ren, Achraf Bahamou (16 Jun 2020)
- The Limit of the Batch Size. Yang You, Yuhui Wang, Huan Zhang, Zhao-jie Zhang, J. Demmel, Cho-Jui Hsieh (15 Jun 2020)
- Optimization Theory for ReLU Neural Networks Trained with Normalization Layers. Yonatan Dukler, Quanquan Gu, Guido Montúfar (11 Jun 2020)
- Sketchy Empirical Natural Gradient Methods for Deep Learning. Minghan Yang, Dong Xu, Zaiwen Wen, Mengyun Chen, Pengxiang Xu (10 Jun 2020)
- On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs. Matilde Gargiani, Andrea Zanelli, Moritz Diehl, Frank Hutter (03 Jun 2020)
- Encoding formulas as deep networks: Reinforcement learning for zero-shot execution of LTL formulas. Yen-Ling Kuo, Boris Katz, Andrei Barbu (01 Jun 2020)
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning. Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney (01 Jun 2020)
- Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the Predictive Uncertainties. J. Lindinger, David Reeb, C. Lippert, Barbara Rakitsch (22 May 2020)
- On the Locality of the Natural Gradient for Deep Learning. Nihat Ay (21 May 2020)
- Addressing Catastrophic Forgetting in Few-Shot Problems. Pauching Yap, H. Ritter, David Barber (30 Apr 2020)
- WoodFisher: Efficient Second-Order Approximation for Neural Network Compression. Sidak Pal Singh, Dan Alistarh (29 Apr 2020)
- Continual Learning with Extended Kronecker-factored Approximate Curvature. Janghyeon Lee, H. Hong, Donggyu Joo, Junmo Kim (16 Apr 2020)
- Deep Neural Network Learning with Second-Order Optimizers -- a Practical Study with a Stochastic Quasi-Gauss-Newton Method. C. Thiele, Mauricio Araya-Polo, D. Hohl (06 Apr 2020)
- RelatIF: Identifying Explanatory Training Examples via Relative Influence. Elnaz Barshan, Marc-Etienne Brunet, Gintare Karolina Dziugaite (25 Mar 2020)
- What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective. Qilong Wang, Li Zhang, Banggu Wu, Dongwei Ren, P. Li, W. Zuo, Q. Hu (25 Mar 2020)
- Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses. Charles G. Frye, James B. Simon, Neha S. Wadia, A. Ligeralde, M. DeWeese, K. Bouchard (23 Mar 2020)
- Communication-Efficient Distributed Deep Learning: A Comprehensive Survey. Zhenheng Tang, Shaoshuai Shi, Wei Wang, Yue Liu, Xiaowen Chu (10 Mar 2020)
- Fast Predictive Uncertainty for Classification with Bayesian Deep Networks. Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig (02 Mar 2020)
- Disentangling Adaptive Gradient Methods from Learning Rates. Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang (26 Feb 2020)
- Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs. Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao (25 Feb 2020)
- The Two Regimes of Deep Network Training. Guillaume Leclerc, Aleksander Madry (24 Feb 2020)
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks. Agustinus Kristiadi, Matthias Hein, Philipp Hennig (24 Feb 2020)
- Scalable Second Order Optimization for Deep Learning. Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Y. Singer (20 Feb 2020)
- DDPNOpt: Differential Dynamic Programming Neural Optimizer. Guan-Horng Liu, T. Chen, Evangelos A. Theodorou (20 Feb 2020)
- Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent. Pu Zhao, Pin-Yu Chen, Siyue Wang, Xinyu Lin (18 Feb 2020)
- Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning. Arsenii Ashukha, Alexander Lyzhov, Dmitry Molchanov, Dmitry Vetrov (15 Feb 2020)
- Scalable and Practical Natural Gradient for Large-Scale Deep Learning. Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota (13 Feb 2020)
- On the distance between two neural networks and the stability of learning. Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li (09 Feb 2020)
- On the Convex Behavior of Deep Neural Networks in Relation to the Layers' Width. Etai Littwin, Lior Wolf (14 Jan 2020)
- Information Newton's flow: second-order optimization method in probability space. Yifei Wang, Wuchen Li (13 Jan 2020)
- A Dynamic Sampling Adaptive-SGD Method for Machine Learning. Achraf Bahamou, Shiqian Ma (31 Dec 2019)
- BackPACK: Packing more into backprop. Felix Dangel, Frederik Kunstner, Philipp Hennig (23 Dec 2019)
- Second-order Information in First-order Optimization Methods. Yuzheng Hu, Licong Lin, Shange Tang (20 Dec 2019)
- Optimization for deep learning: theory and algorithms. Ruoyu Sun (19 Dec 2019)
- Tangent Space Separability in Feedforward Neural Networks. Balint Daroczy, Rita Aleksziev, András A. Benczúr (18 Dec 2019)
- PyHessian: Neural Networks Through the Lens of the Hessian. Z. Yao, A. Gholami, Kurt Keutzer, Michael W. Mahoney (16 Dec 2019)
- Regularization Shortcomings for Continual Learning. Timothée Lesort, Andrei Stoian, David Filliat (06 Dec 2019)
- Biologically inspired architectures for sample-efficient deep reinforcement learning. Pierre Harvey Richemond, Arinbjorn Kolbeinsson, Yike Guo (25 Nov 2019)