Optimizing Neural Networks with Kronecker-factored Approximate Curvature

19 March 2015

Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"

50 / 645 papers shown

A Sub-sampled Tensor Method for Non-convex Optimization Aurelien Lucchi Jonas Köhler 54 0 0 23 Nov 2019
Automatic Differentiable Monte Carlo: Theory and Application Shi-Xin Zhang Z. Wan H. Yao 57 17 0 20 Nov 2019
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks Aditya Golatkar Alessandro Achille Stefano Soatto CLL MU 114 508 0 12 Nov 2019
Optimizing Millions of Hyperparameters by Implicit Differentiation Jonathan Lorraine Paul Vicol David Duvenaud DD 139 417 0 06 Nov 2019
Stein Variational Gradient Descent With Matrix-Valued Kernels Dilin Wang Ziyang Tang Minh Nguyen Qiang Liu 90 62 0 28 Oct 2019
Kernelized Wasserstein Natural Gradient Michael Arbel Arthur Gretton Wuchen Li Guido Montúfar 68 23 0 21 Oct 2019
A Stochastic Extra-Step Quasi-Newton Method for Nonsmooth Nonconvex Optimization Minghan Yang Andre Milzarek Zaiwen Wen Tong Zhang ODL 96 36 0 21 Oct 2019
On Warm-Starting Neural Network Training Jordan T. Ash Ryan P. Adams AI4CE 58 21 0 18 Oct 2019
First-Order Preconditioning via Hypergradient Descent Theodore H. Moskovitz Rui Wang Janice Lan Sanyam Kapoor Thomas Miconi J. Yosinski Aditya Rawal AI4CE 81 8 0 18 Oct 2019
Pathological spectra of the Fisher information metric and its variants in deep neural networks Ryo Karakida S. Akaho S. Amari 77 28 0 14 Oct 2019
On Empirical Comparisons of Optimizers for Deep Learning Dami Choi Christopher J. Shallue Zachary Nado Jaehoon Lee Chris J. Maddison George E. Dahl 132 259 0 11 Oct 2019
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation S. Meng Sharan Vaswani I. Laradji Mark Schmidt Simon Lacoste-Julien 102 34 0 11 Oct 2019
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation Konstantinos Pitas 75 8 0 06 Sep 2019
Accelerated Information Gradient flow Yifei Wang Wuchen Li 86 57 0 04 Sep 2019
Meta-Learning with Warped Gradient Descent Sebastian Flennerhag Andrei A. Rusu Razvan Pascanu Francesco Visin Hujun Yin R. Hadsell 110 210 0 30 Aug 2019
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence Lingxiao Wang Qi Cai Zhuoran Yang Zhaoran Wang 113 242 0 29 Aug 2019
Variational Bayes on Manifolds Minh-Ngoc Tran D. Nguyen Duy Nguyen 118 23 0 08 Aug 2019
Lookahead Optimizer: k steps forward, 1 step back Michael Ruogu Zhang James Lucas Geoffrey E. Hinton Jimmy Ba ODL 227 736 0 19 Jul 2019
Learning Neural Networks with Adaptive Regularization Han Zhao Yao-Hung Hubert Tsai Ruslan Salakhutdinov Geoffrey J. Gordon 52 15 0 14 Jul 2019
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model Guodong Zhang Lala Li Zachary Nado James Martens Sushant Sachdeva George E. Dahl Christopher J. Shallue Roger C. Grosse 126 154 0 09 Jul 2019
Modern Deep Reinforcement Learning Algorithms Sergey Ivanov A. Dýakonov OffRL 61 39 0 24 Jun 2019
Efficient Implementation of Second-Order Stochastic Approximation Algorithms in High-Dimensional Problems Jingyi Zhu Long Wang J. Spall 46 14 0 23 Jun 2019
On the Noisy Gradient Descent that Generalizes as SGD Jingfeng Wu Wenqing Hu Haoyi Xiong Jun Huan Vladimir Braverman Zhanxing Zhu MLT 73 10 0 18 Jun 2019
A Survey of Optimization Methods from a Machine Learning Perspective Shiliang Sun Zehui Cao Han Zhu Jing Zhao 88 566 0 17 Jun 2019
Training Neural Networks for and by Interpolation Leonard Berrada Andrew Zisserman M. P. Kumar 3DH 74 63 0 13 Jun 2019
Non-Parametric Calibration for Classification Jonathan Wenger Hedvig Kjellström Rudolph Triebel UQCV 120 82 0 12 Jun 2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks Ryo Karakida S. Akaho S. Amari 73 41 0 07 Jun 2019
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations Wu Lin Mohammad Emtiyaz Khan Mark Schmidt BDL 111 71 0 07 Jun 2019
Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks Yi Ren Shiqian Ma 64 37 0 05 Jun 2019
Neural Replicator Dynamics Daniel Hennes Dustin Morrill Shayegan Omidshafiei Rémi Munos Julien Perolat ... A. Gruslys Jean-Baptiste Lespiau Paavo Parmas Edgar A. Duénez-Guzmán K. Tuyls 74 16 0 01 Jun 2019
Matrix-Free Preconditioning in Online Learning Ashok Cutkosky Tamás Sarlós ODL 98 16 0 29 May 2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent Frederik Kunstner Lukas Balles Philipp Hennig 101 219 0 29 May 2019
Network Deconvolution Chengxi Ye Matthew Evanusa Hua He A. Mitrokhin Tom Goldstein J. Yorke Cornelia Fermuller Yiannis Aloimonos 87 35 0 28 May 2019
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems Tianle Cai Ruiqi Gao Jikai Hou Siyu Chen Dong Wang Di He Zhihua Zhang Liwei Wang ODL 76 57 0 28 May 2019
Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks Guodong Zhang James Martens Roger C. Grosse ODL 113 126 0 27 May 2019
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment Jivitesh Sharma Per-Arne Andersen Ole-Christoffer Granmo M. G. Olsen AI4CE 74 70 0 23 May 2019
Adaptive norms for deep learning with regularized Newton methods Jonas Köhler Leonard Adolphs Aurelien Lucchi ODL 45 12 0 22 May 2019
LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning Jingjing Zhang Osvaldo Simeone 72 32 0 22 May 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis Chaoqi Wang Roger C. Grosse Sanja Fidler Guodong Zhang 80 124 0 15 May 2019
BayesNAS: A Bayesian Approach for Neural Architecture Search Hongpeng Zhou Minghao Yang Jun Wang Wei Pan BDL 103 202 0 13 May 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes Yang You Jing Li Sashank J. Reddi Jonathan Hseu Sanjiv Kumar Srinadh Bhojanapalli Xiaodan Song J. Demmel Kurt Keutzer Cho-Jui Hsieh ODL 335 1,001 0 01 Apr 2019
Parabolic Approximation Line Search for DNNs Max Mutschler A. Zell ODL 95 20 0 28 Mar 2019
Inefficiency of K-FAC for Large Batch Size Training Linjian Ma Gabe Montague Jiayu Ye Z. Yao A. Gholami Kurt Keutzer Michael W. Mahoney 58 24 0 14 Mar 2019
DeepOBS: A Deep Learning Optimizer Benchmark Suite Frank Schneider Lukas Balles Philipp Hennig ODL 129 71 0 13 Mar 2019
The Variational Predictive Natural Gradient Da Tang Rajesh Ranganath BDL DRL 43 10 0 07 Mar 2019
An Optimistic Acceleration of AMSGrad for Nonconvex Optimization Jun-Kun Wang Xiaoyun Li Belhal Karimi Ping Li ODL 65 1 0 04 Mar 2019
A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning Xiang Li Wenhao Yang Zhihua Zhang 29 2 0 02 Mar 2019
Equi-normalization of Neural Networks Pierre Stock Benjamin Graham Rémi Gribonval Hervé Jégou ODL 46 18 0 27 Feb 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise Yeming Wen Kevin Luk Maxime Gazeau Guodong Zhang Harris Chan Jimmy Ba ODL 73 22 0 21 Feb 2019
Extreme Tensoring for Low-Memory Preconditioning Xinyi Chen Naman Agarwal Elad Hazan Cyril Zhang Yi Zhang 63 11 0 12 Feb 2019