Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods

19 July 2018

Papers citing "Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods"


Title | Authors | Tags | Counts | Date
On the Convergence of Adam and Beyond | Sashank J. Reddi, Satyen Kale, Sanjiv Kumar | - | 106 2,506 0 | 19 Apr 2019
Adaptive Gradient Methods with Dynamic Bound of Learning Rate | Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun | ODL | 83 602 0 | 26 Feb 2019
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam | Mohammad Emtiyaz Khan, Didrik Nielsen, Voot Tangkaratt, Wu Lin, Y. Gal, Akash Srivastava | ODL | 157 271 0 | 13 Jun 2018
Improving Generalization Performance by Switching from Adam to SGD | N. Keskar, R. Socher | ODL | 105 524 0 | 20 Dec 2017
Noisy Natural Gradient as Variational Inference | Guodong Zhang, Shengyang Sun, David Duvenaud, Roger C. Grosse | ODL | 75 212 0 | 06 Dec 2017
Vprop: Variational Inference using RMSprop | Mohammad Emtiyaz Khan, Zuozhu Liu, Voot Tangkaratt, Y. Gal | BDL | 55 17 0 | 04 Dec 2017
Decoupled Weight Decay Regularization | I. Loshchilov, Frank Hutter | OffRL | 151 2,158 0 | 14 Nov 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning | Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht | ODL | 86 1,032 0 | 23 May 2017
Stochastic Gradient Descent as Approximate Bayesian Inference | Stephan Mandt, Matthew D. Hoffman, David M. Blei | BDL | 67 599 0 | 13 Apr 2017
Conjugate-Computation Variational Inference: Converting Variational Inference in Non-Conjugate Models to Inferences in Conjugate Models | Mohammad Emtiyaz Khan, Wu Lin | BDL | 53 137 0 | 13 Mar 2017
Online Natural Gradient as a Kalman Filter | Yann Ollivier | - | 70 68 0 | 01 Mar 2017
Aggregated Residual Transformations for Deep Neural Networks | Saining Xie, Ross B. Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He | - | 522 10,351 0 | 16 Nov 2016
Densely Connected Convolutional Networks | Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger | PINN 3DV | 802 36,892 0 | 25 Aug 2016
A Kronecker-factored approximate Fisher matrix for convolution layers | Roger C. Grosse, James Martens | ODL | 105 264 0 | 03 Feb 2016
Deep Residual Learning for Image Recognition | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | MedIm | 2.2K 194,510 0 | 10 Dec 2015
Optimizing Neural Networks with Kronecker-factored Approximate Curvature | James Martens, Roger C. Grosse | ODL | 109 1,024 0 | 19 Mar 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | VLM | 347 18,654 0 | 06 Feb 2015
Adam: A Method for Stochastic Optimization | Diederik P. Kingma, Jimmy Ba | ODL | 2.1K 150,364 0 | 22 Dec 2014
Generating Sequences With Recurrent Neural Networks | Alex Graves | GAN | 169 4,039 0 | 04 Aug 2013