ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

A Mini-Block Fisher Method for Deep Neural Networks

arXiv: 2202.04124 (v4, latest) — abs · PDF · HTML
8 February 2022
Achraf Bahamou, Shiqian Ma, Yi Ren
Topic: ODL
Papers citing "A Mini-Block Fisher Method for Deep Neural Networks"

18 papers shown.

  • An Efficient Nonlinear Acceleration method that Exploits Symmetry of the Hessian — Huan He, Shifan Zhao, Z. Tang, Joyce C. Ho, Y. Saad, Yuanzhe Xi — 22 Oct 2022
  • Tensor Normal Training for Deep Learning Models — Yi Ren, Shiqian Ma — 05 Jun 2021
  • Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers — Robin M. Schmidt, Frank Schneider, Philipp Hennig — 03 Jul 2020 [ODL]
  • ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning — Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney — 01 Jun 2020 [ODL]
  • On Empirical Comparisons of Optimizers for Deep Learning — Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl — 11 Oct 2019
  • Limitations of the Empirical Fisher Approximation for Natural Gradient Descent — Frederik Kunstner, Lukas Balles, Philipp Hennig — 29 May 2019
  • Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks — Guodong Zhang, James Martens, Roger C. Grosse — 27 May 2019 [ODL]
  • Measuring the Effects of Data Parallelism on Neural Network Training — Christopher J. Shallue, Jaehoon Lee, J. Antognini, J. Mamou, J. Ketterling, Yao Wang — 08 Nov 2018
  • Fisher Information and Natural Gradient Learning of Random Deep Networks — S. Amari, Ryo Karakida, Masafumi Oizumi — 22 Aug 2018
  • Nonlinear Acceleration of CNNs — Damien Scieur, Edouard Oyallon, Alexandre d’Aspremont, Francis R. Bach — 01 Jun 2018
  • Shampoo: Preconditioned Stochastic Tensor Optimization — Vineet Gupta, Tomer Koren, Y. Singer — 26 Feb 2018 [ODL]
  • FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling — Jie Chen, Tengfei Ma, Cao Xiao — 30 Jan 2018 [GNN]
  • Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms — Han Xiao, Kashif Rasul, Roland Vollgraf — 25 Aug 2017
  • Practical Gauss-Newton Optimisation for Deep Learning — Aleksandar Botev, H. Ritter, David Barber — 12 Jun 2017 [ODL]
  • Optimizing Neural Networks with Kronecker-factored Approximate Curvature — James Martens, Roger C. Grosse — 19 Mar 2015 [ODL]
  • A Stochastic Quasi-Newton Method for Large-Scale Optimization — R. Byrd, Samantha Hansen, J. Nocedal, Y. Singer — 27 Jan 2014 [ODL]
  • Riemannian metrics for neural networks I: feedforward networks — Yann Ollivier — 04 Mar 2013
  • Krylov Subspace Descent for Deep Learning — Oriol Vinyals, Daniel Povey — 18 Nov 2011 [ODL]