arXiv:1901.08276
Traditional and Heavy-Tailed Self Regularization in Neural Network Models
Charles H. Martin, Michael W. Mahoney
24 January 2019
Papers citing "Traditional and Heavy-Tailed Self Regularization in Neural Network Models" (13 papers shown):
- Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD. Dmitry Dudukalov, Artem Logachov, Vladimir Lotov, Timofei Prasolov, Evgeny Prokopenko, Anton Tarasenko (24 May 2025)
- An Investigation of the Weight Space to Monitor the Training Progress of Neural Networks. Konstantin Schurholt, Damian Borth (18 Jun 2020)
- Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks. Charles H. Martin, Michael W. Mahoney (24 Jan 2019)
- Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning. Charles H. Martin, Michael W. Mahoney (02 Oct 2018)
- Regularization for Deep Learning: A Taxonomy. J. Kukačka, Vladimir Golkov, Daniel Cremers (29 Oct 2017)
- Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior. Charles H. Martin, Michael W. Mahoney (26 Oct 2017)
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He (08 Jun 2017)
- Train longer, generalize better: closing the generalization gap in large batch training of neural networks. Elad Hoffer, Itay Hubara, Daniel Soudry (24 May 2017)
- Understanding deep learning requires rethinking generalization. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals (10 Nov 2016)
- Cleaning large correlation matrices: tools from random matrix theory. J. Bun, J. Bouchaud, M. Potters (25 Oct 2016)
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)
- The Loss Surfaces of Multilayer Networks. A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun (30 Nov 2014)
- Limit Theory for the largest eigenvalues of sample covariance matrices with heavy-tails. Richard A. Davis, Oliver Pfaffel, R. Stelzer (27 Aug 2011)