Optimizing Neural Networks with Kronecker-factored Approximate Curvature

19 March 2015
James Martens
Roger C. Grosse
    ODL

Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"

50 / 645 papers shown
A Sub-sampled Tensor Method for Non-convex Optimization
Aurelien Lucchi
Jonas Köhler
54
0
0
23 Nov 2019
Automatic Differentiable Monte Carlo: Theory and Application
Shi-Xin Zhang
Z. Wan
H. Yao
57
17
0
20 Nov 2019
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
Aditya Golatkar
Alessandro Achille
Stefano Soatto
CLL, MU
114
508
0
12 Nov 2019
Optimizing Millions of Hyperparameters by Implicit Differentiation
Jonathan Lorraine
Paul Vicol
David Duvenaud
DD
139
417
0
06 Nov 2019
Stein Variational Gradient Descent With Matrix-Valued Kernels
Dilin Wang
Ziyang Tang
Minh Nguyen
Qiang Liu
90
62
0
28 Oct 2019
Kernelized Wasserstein Natural Gradient
Michael Arbel
Arthur Gretton
Wuchen Li
Guido Montúfar
68
23
0
21 Oct 2019
A Stochastic Extra-Step Quasi-Newton Method for Nonsmooth Nonconvex Optimization
Minghan Yang
Andre Milzarek
Zaiwen Wen
Tong Zhang
ODL
96
36
0
21 Oct 2019
On Warm-Starting Neural Network Training
Jordan T. Ash
Ryan P. Adams
AI4CE
58
21
0
18 Oct 2019
First-Order Preconditioning via Hypergradient Descent
Theodore H. Moskovitz
Rui Wang
Janice Lan
Sanyam Kapoor
Thomas Miconi
J. Yosinski
Aditya Rawal
AI4CE
81
8
0
18 Oct 2019
Pathological spectra of the Fisher information metric and its variants in deep neural networks
Ryo Karakida
S. Akaho
S. Amari
77
28
0
14 Oct 2019
On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi
Christopher J. Shallue
Zachary Nado
Jaehoon Lee
Chris J. Maddison
George E. Dahl
132
259
0
11 Oct 2019
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
S. Meng
Sharan Vaswani
I. Laradji
Mark Schmidt
Simon Lacoste-Julien
102
34
0
11 Oct 2019
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Konstantinos Pitas
75
8
0
06 Sep 2019
Accelerated Information Gradient flow
Yifei Wang
Wuchen Li
86
57
0
04 Sep 2019
Meta-Learning with Warped Gradient Descent
Sebastian Flennerhag
Andrei A. Rusu
Razvan Pascanu
Francesco Visin
Hujun Yin
R. Hadsell
110
210
0
30 Aug 2019
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Lingxiao Wang
Qi Cai
Zhuoran Yang
Zhaoran Wang
113
242
0
29 Aug 2019
Variational Bayes on Manifolds
Minh-Ngoc Tran
D. Nguyen
Duy Nguyen
118
23
0
08 Aug 2019
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
227
736
0
19 Jul 2019
Learning Neural Networks with Adaptive Regularization
Han Zhao
Yao-Hung Hubert Tsai
Ruslan Salakhutdinov
Geoffrey J. Gordon
52
15
0
14 Jul 2019
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model
Guodong Zhang
Lala Li
Zachary Nado
James Martens
Sushant Sachdeva
George E. Dahl
Christopher J. Shallue
Roger C. Grosse
126
154
0
09 Jul 2019
Modern Deep Reinforcement Learning Algorithms
Sergey Ivanov
A. Dýakonov
OffRL
61
39
0
24 Jun 2019
Efficient Implementation of Second-Order Stochastic Approximation Algorithms in High-Dimensional Problems
Jingyi Zhu
Long Wang
J. Spall
46
14
0
23 Jun 2019
On the Noisy Gradient Descent that Generalizes as SGD
Jingfeng Wu
Wenqing Hu
Haoyi Xiong
Jun Huan
Vladimir Braverman
Zhanxing Zhu
MLT
73
10
0
18 Jun 2019
A Survey of Optimization Methods from a Machine Learning Perspective
Shiliang Sun
Zehui Cao
Han Zhu
Jing Zhao
88
566
0
17 Jun 2019
Training Neural Networks for and by Interpolation
Leonard Berrada
Andrew Zisserman
M. P. Kumar
3DH
74
63
0
13 Jun 2019
Non-Parametric Calibration for Classification
Jonathan Wenger
Hedvig Kjellström
Rudolph Triebel
UQCV
120
82
0
12 Jun 2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
Ryo Karakida
S. Akaho
S. Amari
73
41
0
07 Jun 2019
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations
Wu Lin
Mohammad Emtiyaz Khan
Mark Schmidt
BDL
111
71
0
07 Jun 2019
Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks
Yi Ren
Shiqian Ma
64
37
0
05 Jun 2019
Neural Replicator Dynamics
Daniel Hennes
Dustin Morrill
Shayegan Omidshafiei
Rémi Munos
Julien Perolat
...
A. Gruslys
Jean-Baptiste Lespiau
Paavo Parmas
Edgar A. Duénez-Guzmán
K. Tuyls
74
16
0
01 Jun 2019
Matrix-Free Preconditioning in Online Learning
Ashok Cutkosky
Tamás Sarlós
ODL
98
16
0
29 May 2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Frederik Kunstner
Lukas Balles
Philipp Hennig
101
219
0
29 May 2019
Network Deconvolution
Chengxi Ye
Matthew Evanusa
Hua He
A. Mitrokhin
Tom Goldstein
J. Yorke
Cornelia Fermuller
Yiannis Aloimonos
87
35
0
28 May 2019
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai
Ruiqi Gao
Jikai Hou
Siyu Chen
Dong Wang
Di He
Zhihua Zhang
Liwei Wang
ODL
76
57
0
28 May 2019
Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
Guodong Zhang
James Martens
Roger C. Grosse
ODL
113
126
0
27 May 2019
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment
Jivitesh Sharma
Per-Arne Andersen
Ole-Christoffer Granmo
M. G. Olsen
AI4CE
74
70
0
23 May 2019
Adaptive norms for deep learning with regularized Newton methods
Jonas Köhler
Leonard Adolphs
Aurelien Lucchi
ODL
45
12
0
22 May 2019
LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning
Jingjing Zhang
Osvaldo Simeone
72
32
0
22 May 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis
Chaoqi Wang
Roger C. Grosse
Sanja Fidler
Guodong Zhang
80
124
0
15 May 2019
BayesNAS: A Bayesian Approach for Neural Architecture Search
Hongpeng Zhou
Minghao Yang
Jun Wang
Wei Pan
BDL
103
202
0
13 May 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
335
1,001
0
01 Apr 2019
Parabolic Approximation Line Search for DNNs
Max Mutschler
A. Zell
ODL
95
20
0
28 Mar 2019
Inefficiency of K-FAC for Large Batch Size Training
Linjian Ma
Gabe Montague
Jiayu Ye
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
58
24
0
14 Mar 2019
DeepOBS: A Deep Learning Optimizer Benchmark Suite
Frank Schneider
Lukas Balles
Philipp Hennig
ODL
129
71
0
13 Mar 2019
The Variational Predictive Natural Gradient
Da Tang
Rajesh Ranganath
BDL, DRL
43
10
0
07 Mar 2019
An Optimistic Acceleration of AMSGrad for Nonconvex Optimization
Jun-Kun Wang
Xiaoyun Li
Belhal Karimi
Ping Li
ODL
65
1
0
04 Mar 2019
A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning
Xiang Li
Wenhao Yang
Zhihua Zhang
29
2
0
02 Mar 2019
Equi-normalization of Neural Networks
Pierre Stock
Benjamin Graham
Rémi Gribonval
Hervé Jégou
ODL
46
18
0
27 Feb 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
73
22
0
21 Feb 2019
Extreme Tensoring for Low-Memory Preconditioning
Xinyi Chen
Naman Agarwal
Elad Hazan
Cyril Zhang
Yi Zhang
63
11
0
12 Feb 2019