ResearchTrend.AI

arXiv:1610.05792
Big Batch SGD: Automated Inference using Adaptive Batch Sizes

18 October 2016
Soham De
A. Yadav
David Jacobs
Tom Goldstein
    ODL
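The paper's title names its central idea: adapting the SGD batch size automatically during training. A minimal sketch of that general idea (not the authors' exact algorithm — all function names, thresholds, and the variance test here are illustrative assumptions): grow the batch whenever the sample-gradient variance is large relative to the squared norm of the mean gradient, so the batch gradient remains a reliable descent direction.

```python
import numpy as np

def adaptive_batch_sgd(grad_fn, x0, data, lr=0.1, batch0=8,
                       theta=0.5, epochs=5, seed=0):
    """Sketch of adaptive-batch-size SGD: double the batch when the
    per-sample gradient variance is large relative to ||mean grad||^2.
    (Hypothetical parameters; not the paper's exact procedure.)"""
    rng = np.random.default_rng(seed)
    x, batch = x0.astype(float), batch0
    for _ in range(epochs):
        # sample a mini-batch and compute per-sample gradients
        idx = rng.choice(len(data), size=min(batch, len(data)), replace=False)
        grads = np.stack([grad_fn(x, data[i]) for i in idx])
        g = grads.mean(axis=0)
        # estimated variance of the batch-mean gradient
        var = grads.var(axis=0).sum() / len(idx)
        # grow the batch if the gradient estimate is too noisy
        if var > theta * np.dot(g, g) and batch < len(data):
            batch = min(2 * batch, len(data))
        x = x - lr * g
    return x, batch
```

On noiseless data the variance test never fires and the batch stays at its initial size; as the iterate approaches a minimizer and gradients shrink, the test triggers more often and the batch grows, trading gradient noise for per-step cost.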

Papers citing "Big Batch SGD: Automated Inference using Adaptive Batch Sizes"

11 / 11 papers shown

  • Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
    Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
    30 Dec 2024 · 153 / 0 / 0
  • Adaptive Batch Size for Privately Finding Second-Order Stationary Points
    Daogao Liu, Kunal Talwar
    10 Oct 2024 · 141 / 0 / 0
  • Flexible numerical optimization with ensmallen
    Ryan R. Curtin, Marcus Edel, Rahul Prabhu, S. Basak, Zhihao Lou, Conrad Sanderson
    09 Mar 2020 · 18 / 1 / 0
  • History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms [ODL]
    Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang
    21 Oct 2019 · 15 / 5 / 0
  • The Effect of Network Width on the Performance of Large-batch Training
    Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris
    11 Jun 2018 · 16 / 22 / 0
  • AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks [ODL]
    Aditya Devarakonda, Maxim Naumov, M. Garland
    06 Dec 2017 · 19 / 136 / 0
  • Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks [BDL, ODL]
    Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong
    20 Nov 2017 · 40 / 6 / 0
  • Advances in Variational Inference [BDL]
    Cheng Zhang, Judith Butepage, Hedvig Kjellström, Stephan Mandt
    15 Nov 2017 · 38 / 684 / 0
  • Coupling Adaptive Batch Sizes with Learning Rates [ODL]
    Lukas Balles, Javier Romero, Philipp Hennig
    15 Dec 2016 · 21 / 110 / 0
  • On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima [ODL]
    N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
    15 Sep 2016 · 308 / 2,890 / 0
  • Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
    Hamed Karimi, J. Nutini, Mark W. Schmidt
    16 Aug 2016 · 139 / 1,199 / 0