Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.07774
Cited By
On the interplay between noise and curvature and its effect on optimization and generalization
18 June 2019
Valentin Thomas
Fabian Pedregosa
B. V. Merrienboer
Pierre-Antoine Mangazol
Yoshua Bengio
Nicolas Le Roux
Re-assign community
ArXiv
PDF
HTML
Papers citing
"On the interplay between noise and curvature and its effect on optimization and generalization"
22 / 22 papers shown
Title
An Improved Empirical Fisher Approximation for Natural Gradient Descent
Xiaodong Wu
Wenyi Yu
Chao Zhang
Philip Woodland
36
3
0
10 Jun 2024
Fading memory as inductive bias in residual recurrent networks
I. Dubinin
Felix Effenberger
43
4
0
27 Jul 2023
Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances
Marcel Kühn
B. Rosenow
29
3
0
08 Jun 2023
Fast as CHITA: Neural Network Pruning with Combinatorial Optimization
Riade Benbaki
Wenyu Chen
X. Meng
Hussein Hazimeh
Natalia Ponomareva
Zhe Zhao
Rahul Mazumder
21
26
0
28 Feb 2023
On the Lipschitz Constant of Deep Networks and Double Descent
Matteo Gamba
Hossein Azizpour
Mårten Björkman
33
7
0
28 Jan 2023
How to select an objective function using information theory
T. Hodson
T. Over
T. Smith
Lucy M. Marshall
21
1
0
10 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang
Yongyi Mao
35
10
0
19 Nov 2022
Noise Injection as a Probe of Deep Learning Dynamics
Noam Levi
I. Bloch
M. Freytsis
T. Volansky
42
2
0
24 Oct 2022
Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning
Lin Zhang
S. Shi
Wei Wang
Bo-wen Li
38
10
0
30 Jun 2022
Hybrid quantum ResNet for car classification and its hyperparameter optimization
Asel Sagingalieva
Mohammad Kordzanganeh
Andrii Kurkin
Artem Melnikov
Daniil Kuhmistrov
M. Perelshtein
A. Melnikov
Andrea Skolik
David Von Dollen
61
36
0
10 May 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
9
0
31 Jan 2022
A generalization gap estimation for overparameterized models via the Langevin functional variance
Akifumi Okuno
Keisuke Yano
50
1
0
07 Dec 2021
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Alexandre Ramé
Corentin Dancette
Matthieu Cord
OOD
47
206
0
07 Sep 2021
The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
D. Kunin
Javier Sagastuy-Breña
Lauren Gillespie
Eshed Margalit
Hidenori Tanaka
Surya Ganguli
Daniel L. K. Yamins
36
16
0
19 Jul 2021
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi
Lin Zhang
Bo-wen Li
40
9
0
14 Jul 2021
M-FAC: Efficient Matrix-Free Approximations of Second-Order Information
Elias Frantar
Eldar Kurtic
Dan Alistarh
18
57
0
07 Jul 2021
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics
Charles H. Martin
Michael W. Mahoney
25
19
0
01 Jun 2021
Deep Learning is Singular, and That's Good
Daniel Murfet
Susan Wei
Biwei Huang
Hui Li
Jesse Gell-Redman
T. Quella
UQCV
24
26
0
22 Oct 2020
When Does Preconditioning Help or Hurt Generalization?
S. Amari
Jimmy Ba
Roger C. Grosse
Xuechen Li
Atsushi Nitanda
Taiji Suzuki
Denny Wu
Ji Xu
36
32
0
18 Jun 2020
Scalable and Practical Natural Gradient for Large-Scale Deep Learning
Kazuki Osawa
Yohei Tsuji
Yuichiro Ueno
Akira Naruse
Chuan-Sheng Foo
Rio Yokota
39
36
0
13 Feb 2020
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea
Mahdi Haghifam
Gintare Karolina Dziugaite
Ashish Khisti
Daniel M. Roy
FedML
115
148
0
06 Nov 2019
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
310
2,896
0
15 Sep 2016
1