1710.06451
A Bayesian Perspective on Generalization and Stochastic Gradient Descent
17 October 2017
Samuel L. Smith
Quoc V. Le
BDL
Papers citing "A Bayesian Perspective on Generalization and Stochastic Gradient Descent"
13 / 13 papers shown
Title
Generalization through variance: how noise shapes inductive biases in diffusion models
John J. Vastola
DiffM
384
3
0
16 Apr 2025
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks
Pierfrancesco Beneventano
Blake Woodworth
MLT
73
1
0
15 Jan 2025
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
67
4
0
10 Jan 2025
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
99
15
0
29 Oct 2024
Continual learning with the neural tangent ensemble
Ari S. Benjamin
Christian Pehle
Kyle Daruwalla
UQCV
96
0
0
30 Aug 2024
Variational Stochastic Gradient Descent for Deep Neural Networks
Haotian Chen
Anna Kuzina
Babak Esmaeili
Jakub M. Tomczak
59
0
0
09 Apr 2024
Information-Theoretic Generalization Bounds for Deep Neural Networks
Haiyun He
Christina Lee Yu
75
5
0
04 Apr 2024
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida
S. Akaho
S. Amari
FedML
106
142
0
04 Jun 2018
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski
Zachary Kenton
Devansh Arpit
Nicolas Ballas
Asja Fischer
Yoshua Bengio
Amos Storkey
67
458
0
13 Nov 2017
Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith
Pieter-Jan Kindermans
Chris Ying
Quoc V. Le
ODL
90
990
0
01 Nov 2017
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
Pratik Chaudhari
A. Choromańska
Stefano Soatto
Yann LeCun
Carlo Baldassi
C. Borgs
J. Chayes
Levent Sagun
R. Zecchina
ODL
82
769
0
06 Nov 2016
PAC-Bayesian Theory Meets Bayesian Inference
Pascal Germain
Francis R. Bach
Alexandre Lacoste
Simon Lacoste-Julien
51
182
0
27 May 2016
Hybrid Deterministic-Stochastic Methods for Data Fitting
M. Friedlander
Mark Schmidt
116
387
0
13 Apr 2011