Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015 · arXiv:1503.05671
James Martens, Roger C. Grosse
ODL
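For readers skimming the citation list below, a short refresher on the cited method may be useful: K-FAC approximates each layer's block of the Fisher information matrix by a Kronecker product of two small second-moment matrices, one over the layer's input activations (A) and one over the backpropagated pre-activation gradients (G). Because (A ⊗ G)^{-1} = A^{-1} ⊗ G^{-1}, the approximate natural-gradient step for a weight matrix needs only two small inverses instead of one inverse of the full block. The NumPy sketch below is purely illustrative, not code from the paper or this site, and the variable names are ours; it simply checks the Kronecker-inverse identity under the column-stacking vec convention (the paper's own notation may order the factors differently).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy layer: n_in inputs, n_out outputs, a batch of samples.
n_in, n_out, batch = 4, 3, 256
a = rng.standard_normal((batch, n_in))    # layer input activations
g = rng.standard_normal((batch, n_out))   # backpropagated pre-activation gradients

# Kronecker factors: damped second-moment matrices of activations and gradients.
A = a.T @ a / batch + 1e-3 * np.eye(n_in)    # (n_in, n_in)
G = g.T @ g / batch + 1e-3 * np.eye(n_out)   # (n_out, n_out)

grad_W = rng.standard_normal((n_out, n_in))  # gradient w.r.t. the weight matrix


def vec(M):
    """Column-stacking vec(M)."""
    return M.reshape(-1, order="F")


# Naive step: build and invert the full (n_in*n_out) x (n_in*n_out) Fisher block.
F = np.kron(A, G)
naive = np.linalg.solve(F, vec(grad_W))

# Factored step: (A ⊗ G)^{-1} vec(dW) = vec(G^{-1} dW A^{-1}); only small matrices are inverted.
factored = vec(np.linalg.solve(G, grad_W) @ np.linalg.inv(A))

print(np.allclose(naive, factored))  # True
```

The cost gap this check hints at is the whole point: the naive solve scales with the cube of n_in·n_out, while the factored update scales roughly with n_in³ + n_out³ plus a matrix product, which is what makes Kronecker-factored preconditioning practical for large layers.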

Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"

Showing 45 of 645 citing papers.

A Walk with SGD
Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio
98 · 119 · 0 · 24 Feb 2018

A DIRT-T Approach to Unsupervised Domain Adaptation
Rui Shu, Hung Bui, Hirokazu Narui, Stefano Ermon
80 · 629 · 0 · 23 Feb 2018

EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks
Sheng-Wei Chen, Chun-Nan Chou, Edward Y. Chang
37 · 5 · 0 · 19 Feb 2018

A Progressive Batching L-BFGS Method for Machine Learning
Raghu Bollapragada, Dheevatsa Mudigere, J. Nocedal, Hao-Jun Michael Shi, P. T. P. Tang
ODL · 114 · 153 · 0 · 15 Feb 2018

Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence
Arslan Chaudhry, P. Dokania, Thalaiyasingam Ajanthan, Philip Torr
CLL · 174 · 1,148 · 0 · 30 Jan 2018

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Thomas Griffiths
BDL · 107 · 510 · 0 · 26 Jan 2018

Rover Descent: Learning to optimize by learning to navigate on prototypical loss surfaces
Louis Faury, Flavian Vasile
37 · 2 · 0 · 22 Jan 2018

An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients
Jiaming Song, Yuhuai Wu
39 · 2 · 0 · 17 Jan 2018

True Asymptotic Natural Gradient Optimization
Yann Ollivier
ODL · 32 · 12 · 0 · 22 Dec 2017

Block-diagonal Hessian-free Optimization for Training Neural Networks
Huishuai Zhang, Caiming Xiong, James Bradbury, R. Socher
ODL · 60 · 22 · 0 · 20 Dec 2017

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks
Shankar Krishnan, Ying Xiao, Rif A. Saurous
ODL · 45 · 20 · 0 · 08 Dec 2017

Noisy Natural Gradient as Variational Inference
Guodong Zhang, Shengyang Sun, David Duvenaud, Roger C. Grosse
ODL · 111 · 212 · 0 · 06 Dec 2017

Critical Learning Periods in Deep Neural Networks
Alessandro Achille, Matteo Rovere, Stefano Soatto
72 · 100 · 0 · 24 Nov 2017

Fisher-Rao Metric, Geometry, and Complexity of Neural Networks
Tengyuan Liang, T. Poggio, Alexander Rakhlin, J. Stokes
109 · 226 · 0 · 05 Nov 2017

Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le
ODL · 133 · 996 · 0 · 01 Nov 2017

Riemannian approach to batch normalization
Minhyung Cho, Jaehyung Lee
87 · 94 · 0 · 27 Sep 2017

Implicit Regularization in Deep Learning
Behnam Neyshabur
96 · 148 · 0 · 06 Sep 2017

A Generic Approach for Escaping Saddle points
Sashank J. Reddi, Manzil Zaheer, S. Sra, Barnabás Póczós, Francis R. Bach, Ruslan Salakhutdinov, Alex Smola
122 · 84 · 0 · 05 Sep 2017

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu, Elman Mansimov, Shun Liao, Roger C. Grosse, Jimmy Ba
OffRL · 152 · 631 · 0 · 17 Aug 2017

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans
91 · 107 · 0 · 06 Jul 2017

Practical Gauss-Newton Optimisation for Deep Learning
Aleksandar Botev, H. Ritter, David Barber
ODL · 105 · 232 · 0 · 12 Jun 2017

Training Quantized Nets: A Deeper Understanding
Hao Li, Soham De, Zheng Xu, Christoph Studer, H. Samet, Tom Goldstein
MQ · 91 · 211 · 0 · 07 Jun 2017

Kronecker Recurrent Units
C. Jose, Moustapha Cissé, François Fleuret
ODL · 141 · 46 · 0 · 29 May 2017

Diagonal Rescaling For Neural Networks
Jean Lafond, Nicolas Vasilache, Léon Bottou
67 · 11 · 0 · 25 May 2017

A Neural Network model with Bidirectional Whitening
Y. Fujimoto, T. Ohira
53 · 4 · 0 · 24 Apr 2017

Online Natural Gradient as a Kalman Filter
Yann Ollivier
105 · 68 · 0 · 01 Mar 2017

Scalable Adaptive Stochastic Optimization Using Random Projections
Gabriel Krummenacher, Brian McWilliams, Yannic Kilcher, J. M. Buhmann, N. Meinshausen
ODL · 60 · 17 · 0 · 21 Nov 2016

Trusting SVM for Piecewise Linear CNNs
Leonard Berrada, Andrew Zisserman, M. P. Kumar
74 · 11 · 0 · 07 Nov 2016

Relative Natural Gradient for Learning Large Complex Models
Ke Sun, Frank Nielsen
68 · 5 · 0 · 20 Jun 2016

On the Expressive Power of Deep Neural Networks
M. Raghu, Ben Poole, Jon M. Kleinberg, Surya Ganguli, Jascha Narain Sohl-Dickstein
108 · 791 · 0 · 16 Jun 2016

Learning to learn by gradient descent by gradient descent
Marcin Andrychowicz, Misha Denil, Sergio Gomez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas
139 · 2,010 · 0 · 14 Jun 2016

Kronecker Determinantal Point Processes
Zelda E. Mariet, S. Sra
67 · 31 · 0 · 26 May 2016

Composing graphical models with neural networks for structured representations and fast inference
Matthew J. Johnson, David Duvenaud, Alexander B. Wiltschko, S. R. Datta, Ryan P. Adams
BDL, OCL · 123 · 486 · 0 · 20 Mar 2016

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
Tim Salimans, Diederik P. Kingma
ODL · 219 · 1,949 · 0 · 25 Feb 2016

Learning values across many orders of magnitude
H. V. Hasselt, A. Guez, Matteo Hessel, Volodymyr Mnih, David Silver
88 · 170 · 0 · 24 Feb 2016

Patterns of Scalable Bayesian Inference
E. Angelino, Matthew J. Johnson, Ryan P. Adams
114 · 87 · 0 · 16 Feb 2016

Improved Dropout for Shallow and Deep Learning
Zhe Li, Boqing Gong, Tianbao Yang
BDL, SyDa · 101 · 78 · 0 · 06 Feb 2016

A Kronecker-factored approximate Fisher matrix for convolution layers
Roger C. Grosse, James Martens
ODL · 112 · 265 · 0 · 03 Feb 2016

Preconditioned Stochastic Gradient Descent
Xi-Lin Li
62 · 96 · 0 · 14 Dec 2015

Adding Gradient Noise Improves Learning for Very Deep Networks
Arvind Neelakantan, Luke Vilnis, Quoc V. Le, Ilya Sutskever, Lukasz Kaiser, Karol Kurach, James Martens
AI4CE, ODL · 85 · 545 · 0 · 21 Nov 2015

Data-Dependent Path Normalization in Neural Networks
Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro
110 · 22 · 0 · 20 Nov 2015

adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs
N. Keskar, A. Berahas
ODL · 86 · 35 · 0 · 04 Nov 2015

Natural Neural Networks
Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu
131 · 176 · 0 · 01 Jul 2015

Path-SGD: Path-Normalized Optimization in Deep Neural Networks
Behnam Neyshabur, Ruslan Salakhutdinov, Nathan Srebro
ODL · 105 · 310 · 0 · 08 Jun 2015

New insights and perspectives on the natural gradient method
James Martens
ODL · 233 · 631 · 0 · 03 Dec 2014