1503.05671
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015
James Martens
Roger C. Grosse
ODL
Papers citing
"Optimizing Neural Networks with Kronecker-factored Approximate Curvature"
50 / 645 papers shown
Title
A Sub-sampled Tensor Method for Non-convex Optimization
Aurelien Lucchi
Jonas Köhler
54
0
0
23 Nov 2019
Automatic Differentiable Monte Carlo: Theory and Application
Shi-Xin Zhang
Z. Wan
H. Yao
57
17
0
20 Nov 2019
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
Aditya Golatkar
Alessandro Achille
Stefano Soatto
CLL
MU
114
508
0
12 Nov 2019
Optimizing Millions of Hyperparameters by Implicit Differentiation
Jonathan Lorraine
Paul Vicol
David Duvenaud
DD
139
417
0
06 Nov 2019
Stein Variational Gradient Descent With Matrix-Valued Kernels
Dilin Wang
Ziyang Tang
Minh Nguyen
Qiang Liu
90
62
0
28 Oct 2019
Kernelized Wasserstein Natural Gradient
Michael Arbel
Arthur Gretton
Wuchen Li
Guido Montúfar
68
23
0
21 Oct 2019
A Stochastic Extra-Step Quasi-Newton Method for Nonsmooth Nonconvex Optimization
Minghan Yang
Andre Milzarek
Zaiwen Wen
Tong Zhang
ODL
96
36
0
21 Oct 2019
On Warm-Starting Neural Network Training
Jordan T. Ash
Ryan P. Adams
AI4CE
58
21
0
18 Oct 2019
First-Order Preconditioning via Hypergradient Descent
Theodore H. Moskovitz
Rui Wang
Janice Lan
Sanyam Kapoor
Thomas Miconi
J. Yosinski
Aditya Rawal
AI4CE
81
8
0
18 Oct 2019
Pathological spectra of the Fisher information metric and its variants in deep neural networks
Ryo Karakida
S. Akaho
S. Amari
77
28
0
14 Oct 2019
On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi
Christopher J. Shallue
Zachary Nado
Jaehoon Lee
Chris J. Maddison
George E. Dahl
132
259
0
11 Oct 2019
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
S. Meng
Sharan Vaswani
I. Laradji
Mark Schmidt
Simon Lacoste-Julien
102
34
0
11 Oct 2019
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Konstantinos Pitas
75
8
0
06 Sep 2019
Accelerated Information Gradient flow
Yifei Wang
Wuchen Li
86
57
0
04 Sep 2019
Meta-Learning with Warped Gradient Descent
Sebastian Flennerhag
Andrei A. Rusu
Razvan Pascanu
Francesco Visin
Hujun Yin
R. Hadsell
110
210
0
30 Aug 2019
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Lingxiao Wang
Qi Cai
Zhuoran Yang
Zhaoran Wang
113
242
0
29 Aug 2019
Variational Bayes on Manifolds
Minh-Ngoc Tran
D. Nguyen
Duy Nguyen
118
23
0
08 Aug 2019
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
227
736
0
19 Jul 2019
Learning Neural Networks with Adaptive Regularization
Han Zhao
Yao-Hung Hubert Tsai
Ruslan Salakhutdinov
Geoffrey J. Gordon
52
15
0
14 Jul 2019
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model
Guodong Zhang
Lala Li
Zachary Nado
James Martens
Sushant Sachdeva
George E. Dahl
Christopher J. Shallue
Roger C. Grosse
126
154
0
09 Jul 2019
Modern Deep Reinforcement Learning Algorithms
Sergey Ivanov
A. Dýakonov
OffRL
61
39
0
24 Jun 2019
Efficient Implementation of Second-Order Stochastic Approximation Algorithms in High-Dimensional Problems
Jingyi Zhu
Long Wang
J. Spall
46
14
0
23 Jun 2019
On the Noisy Gradient Descent that Generalizes as SGD
Jingfeng Wu
Wenqing Hu
Haoyi Xiong
Jun Huan
Vladimir Braverman
Zhanxing Zhu
MLT
73
10
0
18 Jun 2019
A Survey of Optimization Methods from a Machine Learning Perspective
Shiliang Sun
Zehui Cao
Han Zhu
Jing Zhao
88
566
0
17 Jun 2019
Training Neural Networks for and by Interpolation
Leonard Berrada
Andrew Zisserman
M. P. Kumar
3DH
74
63
0
13 Jun 2019
Non-Parametric Calibration for Classification
Jonathan Wenger
Hedvig Kjellström
Rudolph Triebel
UQCV
120
82
0
12 Jun 2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
Ryo Karakida
S. Akaho
S. Amari
73
41
0
07 Jun 2019
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations
Wu Lin
Mohammad Emtiyaz Khan
Mark Schmidt
BDL
111
71
0
07 Jun 2019
Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks
Yi Ren
Shiqian Ma
64
37
0
05 Jun 2019
Neural Replicator Dynamics
Daniel Hennes
Dustin Morrill
Shayegan Omidshafiei
Rémi Munos
Julien Perolat
...
A. Gruslys
Jean-Baptiste Lespiau
Paavo Parmas
Edgar A. Duéñez-Guzmán
K. Tuyls
74
16
0
01 Jun 2019
Matrix-Free Preconditioning in Online Learning
Ashok Cutkosky
Tamás Sarlós
ODL
98
16
0
29 May 2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Frederik Kunstner
Lukas Balles
Philipp Hennig
101
219
0
29 May 2019
Network Deconvolution
Chengxi Ye
Matthew Evanusa
Hua He
A. Mitrokhin
Tom Goldstein
J. Yorke
Cornelia Fermuller
Yiannis Aloimonos
87
35
0
28 May 2019
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai
Ruiqi Gao
Jikai Hou
Siyu Chen
Dong Wang
Di He
Zhihua Zhang
Liwei Wang
ODL
76
57
0
28 May 2019
Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
Guodong Zhang
James Martens
Roger C. Grosse
ODL
113
126
0
27 May 2019
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment
Jivitesh Sharma
Per-Arne Andersen
Ole-Christoffer Granmo
M. G. Olsen
AI4CE
74
70
0
23 May 2019
Adaptive norms for deep learning with regularized Newton methods
Jonas Köhler
Leonard Adolphs
Aurelien Lucchi
ODL
45
12
0
22 May 2019
LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning
Jingjing Zhang
Osvaldo Simeone
72
32
0
22 May 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis
Chaoqi Wang
Roger C. Grosse
Sanja Fidler
Guodong Zhang
80
124
0
15 May 2019
BayesNAS: A Bayesian Approach for Neural Architecture Search
Hongpeng Zhou
Minghao Yang
Jun Wang
Wei Pan
BDL
103
202
0
13 May 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
335
1,001
0
01 Apr 2019
Parabolic Approximation Line Search for DNNs
Max Mutschler
A. Zell
ODL
95
20
0
28 Mar 2019
Inefficiency of K-FAC for Large Batch Size Training
Linjian Ma
Gabe Montague
Jiayu Ye
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
58
24
0
14 Mar 2019
DeepOBS: A Deep Learning Optimizer Benchmark Suite
Frank Schneider
Lukas Balles
Philipp Hennig
ODL
129
71
0
13 Mar 2019
The Variational Predictive Natural Gradient
Da Tang
Rajesh Ranganath
BDL
DRL
43
10
0
07 Mar 2019
An Optimistic Acceleration of AMSGrad for Nonconvex Optimization
Jun-Kun Wang
Xiaoyun Li
Belhal Karimi
Ping Li
ODL
65
1
0
04 Mar 2019
A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning
Xiang Li
Wenhao Yang
Zhihua Zhang
29
2
0
02 Mar 2019
Equi-normalization of Neural Networks
Pierre Stock
Benjamin Graham
Rémi Gribonval
Hervé Jégou
ODL
46
18
0
27 Feb 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
73
22
0
21 Feb 2019
Extreme Tensoring for Low-Memory Preconditioning
Xinyi Chen
Naman Agarwal
Elad Hazan
Cyril Zhang
Yi Zhang
63
11
0
12 Feb 2019