ResearchTrend.AI
Revisiting Small Batch Training for Deep Neural Networks

20 April 2018
Dominic Masters
Carlo Luschi
    ODL

Papers citing "Revisiting Small Batch Training for Deep Neural Networks"

50 / 167 papers shown
Extrapolation for Large-batch Training in Deep Learning
Tao R. Lin
Lingjing Kong
Sebastian U. Stich
Martin Jaggi
28
36
0
10 Jun 2020
Momentum-based variance-reduced proximal stochastic gradient method for composite nonconvex stochastic optimization
Yangyang Xu
Yibo Xu
12
23
0
31 May 2020
Reliability and Performance Assessment of Federated Learning on Clinical Benchmark Data
G. Lee
S. Shin
OOD
22
2
0
24 May 2020
Automated Copper Alloy Grain Size Evaluation Using a Deep-learning CNN
George S. Baggs
P. Guerrier
A. Loeb
Jason C. Jones
26
9
0
20 May 2020
Semi-supervised lung nodule retrieval
Mark Loyman
H. Greenspan
SSL
11
2
0
04 May 2020
Adaptive Learning of the Optimal Batch Size of SGD
Motasem Alfarra
Slavomir Hanzely
Alyazeed Albasyoni
Guohao Li
Peter Richtárik
29
5
0
03 May 2020
Hot-Starting the Ac Power Flow with Convolutional Neural Networks
Liangjie Chen
J. Tate
AI4CE
16
16
0
20 Apr 2020
Visualizing key features in X-ray images of epoxy resins for improved material classification using singular value decomposition of deep learning features
E. Avalos
K. Akagi
Y. Nishiura
11
1
0
16 Apr 2020
Hybrid Attention Networks for Flow and Pressure Forecasting in Water Distribution Systems
Ziqing Ma
Shuming Liu
Guancheng Guo
Xipeng Yu
AI4TS
4
4
0
13 Apr 2020
Pipelined Backpropagation at Scale: Training Large Models without Batches
Atli Kosson
Vitaliy Chiley
Abhinav Venigalla
Joel Hestness
Urs Koster
35
33
0
25 Mar 2020
Hyper-Parameter Optimization: A Review of Algorithms and Applications
Tong Yu
Hong Zhu
AAML
21
520
0
12 Mar 2020
Graphcore C2 Card performance for image-based deep learning application: A Report
Ilyes Kacher
Maxime Portaz
Hicham Randrianarivo
Sylvain Peyronnet
GNN
BDL
VLM
25
12
0
26 Feb 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski
Maciej Szymczak
Stanislav Fort
Devansh Arpit
Jacek Tabor
Kyunghyun Cho
Krzysztof J. Geras
50
154
0
21 Feb 2020
Parallel and distributed asynchronous adaptive stochastic gradient methods
Yangyang Xu
Yibo Xu
Yonggui Yan
Colin Sutcher-Shepard
Leopold Grinberg
Jiewei Chen
20
2
0
21 Feb 2020
NeuroFabric: Identifying Ideal Topologies for Training A Priori Sparse Networks
Mihailo Isakov
Michel A. Kinsy
14
1
0
19 Feb 2020
Unique Properties of Flat Minima in Deep Networks
Rotem Mulayoff
T. Michaeli
ODL
24
4
0
11 Feb 2020
Automatic phantom test pattern classification through transfer learning with deep neural networks
Rafael B. Fricks
Justin Solomon
Ehsan Samei
MedIm
12
0
0
22 Jan 2020
Optimized Generic Feature Learning for Few-shot Classification across Domains
Tonmoy Saikia
Thomas Brox
Cordelia Schmid
VLM
30
48
0
22 Jan 2020
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM
Aayush Ankit
I. E. Hajj
S. R. Chalamalasetti
S. Agarwal
M. Marinella
M. Foltin
J. Strachan
D. Milojicic
Wen-mei W. Hwu
Kaushik Roy
21
65
0
24 Dec 2019
Fully Automated Multi-Organ Segmentation in Abdominal Magnetic Resonance Imaging with Deep Neural Networks
Yuhua Chen
Dan Ruan
Jiayu Xiao
Lixia Wang
Bin Sun
R. Saouaf
Wensha Yang
Debiao Li
Z. Fan
21
58
0
23 Dec 2019
A Regression Framework for Predicting User's Next Location using Call Detail Records
M. Mahdizadeh
B. Bahrak
11
7
0
22 Dec 2019
On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
Umut Simsekli
Mert Gurbuzbalaban
T. H. Nguyen
G. Richard
Levent Sagun
29
55
0
29 Nov 2019
Blink: Fast and Generic Collectives for Distributed ML
Guanhua Wang
Shivaram Venkataraman
Amar Phanishayee
J. Thelin
Nikhil R. Devanur
Ion Stoica
VLM
9
136
0
11 Oct 2019
Stochastic Weight Matrix-based Regularization Methods for Deep Neural Networks
Patrik Reizinger
Bálint Gyires-Tóth
24
2
0
26 Sep 2019
Empirical study towards understanding line search approximations for training neural networks
Younghwan Chae
D. Wilke
27
11
0
15 Sep 2019
Driver Identification via the Steering Wheel
Bernhard Gahr
Shu Liu
Kevin Koch
F. Barata
André Dahlinger
Benjamin Ryder
E. Fleisch
Felix Wortmann
LLMSV
6
4
0
09 Sep 2019
Minibatch Processing in Spiking Neural Networks
D. J. Saunders
Cooper Sigrist
Kenneth Chaney
R. Kozma
H. Siegelmann
6
3
0
05 Sep 2019
Automatic Compiler Based FPGA Accelerator for CNN Training
S. Venkataramanaiah
Yufei Ma
Shihui Yin
Eriko Nurvitadhi
A. Dasu
Yu Cao
Jae-sun Seo
26
38
0
15 Aug 2019
Deep Multi-View Learning via Task-Optimal CCA
Heather D. Couture
Roland Kwitt
J. S. Marron
M. Troester
C. Perou
Marc Niethammer
29
6
0
17 Jul 2019
Training Neural Response Selection for Task-Oriented Dialogue Systems
Matthew Henderson
Ivan Vulić
D. Gerz
I. Casanueva
Paweł Budzianowski
Sam Coope
Georgios P. Spithourakis
Tsung-Hsien Wen
N. Mrksic
Pei-hao Su
6
110
0
04 Jun 2019
Neural Entropic Estimation: A faster path to mutual information estimation
Chung Chan
Ali Al-Bashabsheh
Hingpang Huang
Michael Lim
D. Tam
Chao Zhao
11
22
0
30 May 2019
One-element Batch Training by Moving Window
Przemysław Spurek
Szymon Knop
Jacek Tabor
Igor T. Podolak
B. Wójcik
VLM
19
0
0
30 May 2019
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai
Ruiqi Gao
Jikai Hou
Siyu Chen
Dong Wang
Di He
Zhihua Zhang
Liwei Wang
ODL
21
57
0
28 May 2019
Task-Driven Data Verification via Gradient Descent
Siavash Golkar
Kyunghyun Cho
14
0
0
14 May 2019
Low-Memory Neural Network Training: A Technical Report
N. Sohoni
Christopher R. Aberger
Megan Leszczynski
Jian Zhang
Christopher Ré
17
99
0
24 Apr 2019
EvalNorm: Estimating Batch Normalization Statistics for Evaluation
Saurabh Singh
Abhinav Shrivastava
26
51
0
12 Apr 2019
Parallelizable Stack Long Short-Term Memory
Shuoyang Ding
Philipp Koehn
27
3
0
06 Apr 2019
Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools
R. Mayer
Hans-Arno Jacobsen
GNN
27
186
0
27 Mar 2019
Traversing the noise of dynamic mini-batch sub-sampled loss functions: A visual guide
D. Kafka
D. Wilke
26
0
0
20 Mar 2019
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Haowei He
Gao Huang
Yang Yuan
ODL
MLT
25
147
0
02 Feb 2019
Augment your batch: better training with larger batches
Elad Hoffer
Tal Ben-Nun
Itay Hubara
Niv Giladi
Torsten Hoefler
Daniel Soudry
ODL
30
72
0
27 Jan 2019
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli
Levent Sagun
Mert Gurbuzbalaban
26
237
0
18 Jan 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis
Pijika Watcharapichat
Matthias Weidlich
Luo Mai
Paolo Costa
Peter R. Pietzuch
11
69
0
08 Jan 2019
HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array
Linghao Song
Jiachen Mao
Youwei Zhuo
Xuehai Qian
Hai Helen Li
Yiran Chen
24
97
0
07 Jan 2019
Batch Size Influence on Performance of Graphic and Tensor Processing Units during Training and Inference Phases
Yuriy Kochura
Yuri G. Gordienko
Vlad Taran
N. Gordienko
Alexandr Rokovyi
Oleg Alienin
S. Stirenko
AI4CE
11
30
0
31 Dec 2018
Pre-Defined Sparse Neural Networks with Hardware Acceleration
Sourya Dey
Kuan-Wen Huang
P. Beerel
K. Chugg
41
24
0
04 Dec 2018
Deep learning for pedestrians: backpropagation in CNNs
L. Boué
3DV
PINN
18
4
0
29 Nov 2018
Kernel-Based Training of Generative Networks
Kalliopi Basioti
G. Moustakides
E. Psarakis
GAN
16
2
0
23 Nov 2018
Orthographic Feature Transform for Monocular 3D Object Detection
Thomas Roddick
Alex Kendall
R. Cipolla
22
364
0
20 Nov 2018
Measuring the Effects of Data Parallelism on Neural Network Training
Christopher J. Shallue
Jaehoon Lee
J. Antognini
J. Mamou
J. Ketterling
Yao Wang
40
407
0
08 Nov 2018