Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015 · arXiv:1503.05671
James Martens, Roger C. Grosse
ODL
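For readers skimming the citation list below, a short refresher on the cited method may be useful: K-FAC approximates each layer's block of the Fisher information matrix by a Kronecker product of two small second-moment matrices, one over the layer's input activations (A) and one over the backpropagated pre-activation gradients (G). Because (A ⊗ G)^{-1} = A^{-1} ⊗ G^{-1}, the approximate natural-gradient step for a weight matrix needs only two small inverses instead of one inverse of the full block. The NumPy sketch below is purely illustrative, not code from the paper or this site, and the variable names are ours; it simply checks the Kronecker-inverse identity under the column-stacking vec convention (the paper's own notation may order the factors differently).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy layer: n_in inputs, n_out outputs, a batch of samples.
n_in, n_out, batch = 4, 3, 256
a = rng.standard_normal((batch, n_in))    # layer input activations
g = rng.standard_normal((batch, n_out))   # backpropagated pre-activation gradients

# Kronecker factors: damped second-moment matrices of activations and gradients.
A = a.T @ a / batch + 1e-3 * np.eye(n_in)    # (n_in, n_in)
G = g.T @ g / batch + 1e-3 * np.eye(n_out)   # (n_out, n_out)

grad_W = rng.standard_normal((n_out, n_in))  # gradient w.r.t. the weight matrix


def vec(M):
    """Column-stacking vec(M)."""
    return M.reshape(-1, order="F")


# Naive step: build and invert the full (n_in*n_out) x (n_in*n_out) Fisher block.
F = np.kron(A, G)
naive = np.linalg.solve(F, vec(grad_W))

# Factored step: (A ⊗ G)^{-1} vec(dW) = vec(G^{-1} dW A^{-1}); only small matrices are inverted.
factored = vec(np.linalg.solve(G, grad_W) @ np.linalg.inv(A))

print(np.allclose(naive, factored))  # True
```

The cost gap this check hints at is the whole point: the naive solve scales with the cube of n_in·n_out, while the factored update scales roughly with n_in³ + n_out³ plus a matrix product, which is what makes Kronecker-factored preconditioning practical for large layers.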

Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"

Showing 45 of 645 citing papers.

A Walk with SGD
Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio
98 · 119 · 0 · 24 Feb 2018

A DIRT-T Approach to Unsupervised Domain Adaptation
Rui Shu, Hung Bui, Hirokazu Narui, Stefano Ermon
80 · 629 · 0 · 23 Feb 2018

EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks
Sheng-Wei Chen, Chun-Nan Chou, Edward Y. Chang
37 · 5 · 0 · 19 Feb 2018

A Progressive Batching L-BFGS Method for Machine Learning
Raghu Bollapragada, Dheevatsa Mudigere, J. Nocedal, Hao-Jun Michael Shi, P. T. P. Tang
ODL · 114 · 153 · 0 · 15 Feb 2018

Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence
Arslan Chaudhry, P. Dokania, Thalaiyasingam Ajanthan, Philip Torr
CLL · 174 · 1,148 · 0 · 30 Jan 2018

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Thomas Griffiths
BDL · 107 · 510 · 0 · 26 Jan 2018

Rover Descent: Learning to optimize by learning to navigate on prototypical loss surfaces
Louis Faury, Flavian Vasile
37 · 2 · 0 · 22 Jan 2018

An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients
Jiaming Song, Yuhuai Wu
39 · 2 · 0 · 17 Jan 2018

True Asymptotic Natural Gradient Optimization
Yann Ollivier
ODL · 32 · 12 · 0 · 22 Dec 2017

Block-diagonal Hessian-free Optimization for Training Neural Networks
Huishuai Zhang, Caiming Xiong, James Bradbury, R. Socher
ODL · 60 · 22 · 0 · 20 Dec 2017

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks
Shankar Krishnan, Ying Xiao, Rif A. Saurous
ODL · 45 · 20 · 0 · 08 Dec 2017

Noisy Natural Gradient as Variational Inference
Guodong Zhang, Shengyang Sun, David Duvenaud, Roger C. Grosse
ODL · 111 · 212 · 0 · 06 Dec 2017

Critical Learning Periods in Deep Neural Networks
Alessandro Achille, Matteo Rovere, Stefano Soatto
72 · 100 · 0 · 24 Nov 2017

Fisher-Rao Metric, Geometry, and Complexity of Neural Networks
Tengyuan Liang, T. Poggio, Alexander Rakhlin, J. Stokes
109 · 226 · 0 · 05 Nov 2017

Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le
ODL · 133 · 996 · 0 · 01 Nov 2017

Riemannian approach to batch normalization
Minhyung Cho, Jaehyung Lee
87 · 94 · 0 · 27 Sep 2017

Implicit Regularization in Deep Learning
Behnam Neyshabur
96 · 148 · 0 · 06 Sep 2017

A Generic Approach for Escaping Saddle points
Sashank J. Reddi, Manzil Zaheer, S. Sra, Barnabás Póczós, Francis R. Bach, Ruslan Salakhutdinov, Alex Smola
122 · 84 · 0 · 05 Sep 2017

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu, Elman Mansimov, Shun Liao, Roger C. Grosse, Jimmy Ba
OffRL · 152 · 631 · 0 · 17 Aug 2017

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans
91 · 107 · 0 · 06 Jul 2017

Practical Gauss-Newton Optimisation for Deep Learning
Aleksandar Botev, H. Ritter, David Barber
ODL · 105 · 232 · 0 · 12 Jun 2017

Training Quantized Nets: A Deeper Understanding
Hao Li, Soham De, Zheng Xu, Christoph Studer, H. Samet, Tom Goldstein
MQ · 91 · 211 · 0 · 07 Jun 2017

Kronecker Recurrent Units
C. Jose, Moustapha Cissé, François Fleuret
ODL · 141 · 46 · 0 · 29 May 2017

Diagonal Rescaling For Neural Networks
Jean Lafond, Nicolas Vasilache, Léon Bottou
67 · 11 · 0 · 25 May 2017

A Neural Network model with Bidirectional Whitening
Y. Fujimoto, T. Ohira
53 · 4 · 0 · 24 Apr 2017

Online Natural Gradient as a Kalman Filter
Yann Ollivier
105 · 68 · 0 · 01 Mar 2017

Scalable Adaptive Stochastic Optimization Using Random Projections
Gabriel Krummenacher, Brian McWilliams, Yannic Kilcher, J. M. Buhmann, N. Meinshausen
ODL · 60 · 17 · 0 · 21 Nov 2016

Trusting SVM for Piecewise Linear CNNs
Leonard Berrada, Andrew Zisserman, M. P. Kumar
74 · 11 · 0 · 07 Nov 2016

Relative Natural Gradient for Learning Large Complex Models
Ke Sun, Frank Nielsen
68 · 5 · 0 · 20 Jun 2016

On the Expressive Power of Deep Neural Networks
M. Raghu, Ben Poole, Jon M. Kleinberg, Surya Ganguli, Jascha Narain Sohl-Dickstein
108 · 791 · 0 · 16 Jun 2016

Learning to learn by gradient descent by gradient descent
Marcin Andrychowicz, Misha Denil, Sergio Gomez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas
139 · 2,010 · 0 · 14 Jun 2016

Kronecker Determinantal Point Processes
Zelda E. Mariet, S. Sra
67 · 31 · 0 · 26 May 2016

Composing graphical models with neural networks for structured representations and fast inference
Matthew J. Johnson, David Duvenaud, Alexander B. Wiltschko, S. R. Datta, Ryan P. Adams
BDL, OCL · 123 · 486 · 0 · 20 Mar 2016

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
Tim Salimans, Diederik P. Kingma
ODL · 219 · 1,949 · 0 · 25 Feb 2016

Learning values across many orders of magnitude
H. V. Hasselt, A. Guez, Matteo Hessel, Volodymyr Mnih, David Silver
88 · 170 · 0 · 24 Feb 2016

Patterns of Scalable Bayesian Inference
E. Angelino, Matthew J. Johnson, Ryan P. Adams
114 · 87 · 0 · 16 Feb 2016

Improved Dropout for Shallow and Deep Learning
Zhe Li, Boqing Gong, Tianbao Yang
BDL, SyDa · 101 · 78 · 0 · 06 Feb 2016

A Kronecker-factored approximate Fisher matrix for convolution layers
Roger C. Grosse, James Martens
ODL · 112 · 265 · 0 · 03 Feb 2016

Preconditioned Stochastic Gradient Descent
Xi-Lin Li
62 · 96 · 0 · 14 Dec 2015

Adding Gradient Noise Improves Learning for Very Deep Networks
Arvind Neelakantan, Luke Vilnis, Quoc V. Le, Ilya Sutskever, Lukasz Kaiser, Karol Kurach, James Martens
AI4CE, ODL · 85 · 545 · 0 · 21 Nov 2015

Data-Dependent Path Normalization in Neural Networks
Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro
110 · 22 · 0 · 20 Nov 2015

adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs
N. Keskar, A. Berahas
ODL · 86 · 35 · 0 · 04 Nov 2015

Natural Neural Networks
Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu
131 · 176 · 0 · 01 Jul 2015

Path-SGD: Path-Normalized Optimization in Deep Neural Networks
Behnam Neyshabur, Ruslan Salakhutdinov, Nathan Srebro
ODL · 105 · 310 · 0 · 08 Jun 2015

New insights and perspectives on the natural gradient method
James Martens
ODL · 233 · 631 · 0 · 03 Dec 2014