ResearchTrend.AI

A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes

12 February 2021
Zachary Nado
Justin M. Gilmer
Christopher J. Shallue
Rohan Anil
George E. Dahl
    ODL
arXiv: 2102.06356

Papers citing "A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes"

9 / 9 papers shown
The Disharmony between BN and ReLU Causes Gradient Explosion, but is Offset by the Correlation between Activations
Inyoung Paik
Jaesik Choi
26
0
0
23 Apr 2023
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
ODL
32
1
0
08 Aug 2022
Adaptive Gradient Methods with Local Guarantees
Zhou Lu
Wenhan Xia
Sanjeev Arora
Elad Hazan
ODL
27
9
0
02 Mar 2022
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
32
14
0
01 Nov 2021
A Loss Curvature Perspective on Training Instability in Deep Learning
Justin Gilmer
Behrooz Ghorbani
Ankush Garg
Sneha Kudugunta
Behnam Neyshabur
David E. Cardoze
George E. Dahl
Zachary Nado
Orhan Firat
ODL
36
35
0
08 Oct 2021
Logit Attenuating Weight Normalization
Aman Gupta
R. Ramanath
Jun Shi
Anika Ramachandran
Sirou Zhou
Mingzhou Zhou
S. Keerthi
45
1
0
12 Aug 2021
Large-Scale Differentially Private BERT
Rohan Anil
Badih Ghazi
Vineet Gupta
Ravi Kumar
Pasin Manurangsi
36
132
0
03 Aug 2021
On Large-Cohort Training for Federated Learning
Zachary B. Charles
Zachary Garrett
Zhouyuan Huo
Sergei Shmulyian
Virginia Smith
FedML
21
113
0
15 Jun 2021
Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
Zachary Nado
Neil Band
Mark Collier
Josip Djolonga
Michael W. Dusenberry
...
D. Sculley
Balaji Lakshminarayanan
Jasper Snoek
Y. Gal
Dustin Tran
UQCV
ELM
38
96
0
07 Jun 2021