Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks
arXiv:1810.06767 · 16 October 2018
Zhibin Liao, Tom Drummond, Ian Reid, G. Carneiro
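
The cited paper tracks an approximation of the Fisher information matrix to characterise training dynamics. As a rough, minimal sketch of the general idea (not the authors' specific estimator), the trace of the empirical Fisher can be monitored per minibatch as the sum of squared loss gradients over all parameters. The snippet below assumes PyTorch; model, loss_fn, inputs, and targets are hypothetical placeholders.

import torch

def empirical_fisher_trace(model, loss_fn, inputs, targets):
    # Minimal sketch (an assumption, not the paper's exact method):
    # tr(F) is approximated by the sum of squared gradients of the
    # minibatch loss over all trainable parameters.
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    return sum((p.grad.detach() ** 2).sum().item()
               for p in model.parameters() if p.grad is not None)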

Papers citing "Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks" (23 papers)
- Three Factors Influencing Minima in SGD (13 Nov 2017). Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey.
- Don't Decay the Learning Rate, Increase the Batch Size (01 Nov 2017). Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le.
- Adaptive Sampling Strategies for Stochastic Optimization (30 Oct 2017). Raghu Bollapragada, R. Byrd, J. Nocedal.
- A Bayesian Perspective on Generalization and Stochastic Gradient Descent (17 Oct 2017). Samuel L. Smith, Quoc V. Le.
- Empirical Analysis of the Hessian of Over-Parametrized Neural Networks (14 Jun 2017). Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou.
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour (08 Jun 2017). Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He.
- The Loss Surface of Residual Networks: Ensembles and the Role of Batch Normalization (08 Nov 2016). Etai Littwin, Lior Wolf.
- Entropy-SGD: Biasing Gradient Descent Into Wide Valleys (06 Nov 2016). Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina.
- Big Batch SGD: Automated Inference using Adaptive Batch Sizes (18 Oct 2016). Soham De, A. Yadav, David Jacobs, Tom Goldstein.
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (15 Sep 2016). N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang.
- Densely Connected Convolutional Networks (25 Aug 2016). Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger.
- Optimization Methods for Large-Scale Machine Learning (15 Jun 2016). Léon Bottou, Frank E. Curtis, J. Nocedal.
- No bad local minima: Data independent training error guarantees for multilayer neural networks (26 May 2016). Daniel Soudry, Y. Carmon.
- Deep Networks with Stochastic Depth (30 Mar 2016). Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger.
- Deep Residual Learning for Image Recognition (10 Dec 2015). Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (11 Feb 2015). Sergey Ioffe, Christian Szegedy.
- Adam: A Method for Stochastic Optimization (22 Dec 2014). Diederik P. Kingma, Jimmy Ba.
- Qualitatively characterizing neural network optimization problems (19 Dec 2014). Ian Goodfellow, Oriol Vinyals, Andrew M. Saxe.
- New insights and perspectives on the natural gradient method (03 Dec 2014). James Martens.
- ImageNet Large Scale Visual Recognition Challenge (01 Sep 2014). Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei.
- Training Neural Networks with Stochastic Hessian-Free Optimization (16 Jan 2013). Ryan Kiros.
- ADADELTA: An Adaptive Learning Rate Method (22 Dec 2012). Matthew D. Zeiler.
- Hybrid Deterministic-Stochastic Methods for Data Fitting (13 Apr 2011). M. Friedlander, Mark Schmidt.