ResearchTrend.AI
© 2025 ResearchTrend.AI. All rights reserved.

A Bayesian Perspective on Generalization and Stochastic Gradient Descent
Samuel L. Smith, Quoc V. Le
17 October 2017 · arXiv:1710.06451 [BDL]

Papers citing "A Bayesian Perspective on Generalization and Stochastic Gradient Descent"

13 / 13 papers shown.

1. "Generalization through variance: how noise shapes inductive biases in diffusion models" by John J. Vastola (16 Apr 2025) [DiffM]
2. "Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks" by Pierfrancesco Beneventano and Blake Woodworth (15 Jan 2025) [MLT]
3. "Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit" by Oleg Filatov, Jan Ebert, Jiangtao Wang, and Stefan Kesselheim (10 Jan 2025)
4. "How Does Critical Batch Size Scale in Pre-training?" by Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Phillips Foster, and Sham Kakade (29 Oct 2024)
5. "Continual learning with the neural tangent ensemble" by Ari S. Benjamin, Christian Pehle, and Kyle Daruwalla (30 Aug 2024) [UQCV]
6. "Variational Stochastic Gradient Descent for Deep Neural Networks" by Haotian Chen, Anna Kuzina, Babak Esmaeili, and Jakub M. Tomczak (09 Apr 2024)
7. "Information-Theoretic Generalization Bounds for Deep Neural Networks" by Haiyun He and Christina Lee Yu (04 Apr 2024)
8. "Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach" by Ryo Karakida, S. Akaho, and S. Amari (04 Jun 2018) [FedML]
9. "Three Factors Influencing Minima in SGD" by Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, and Amos Storkey (13 Nov 2017)
10. "Don't Decay the Learning Rate, Increase the Batch Size" by Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, and Quoc V. Le (01 Nov 2017) [ODL]
11. "Entropy-SGD: Biasing Gradient Descent Into Wide Valleys" by Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, and R. Zecchina (06 Nov 2016) [ODL]
12. "PAC-Bayesian Theory Meets Bayesian Inference" by Pascal Germain, Francis R. Bach, Alexandre Lacoste, and Simon Lacoste-Julien (27 May 2016)
13. "Hybrid Deterministic-Stochastic Methods for Data Fitting" by M. Friedlander and Mark Schmidt (13 Apr 2011)