1706.02677
v1
v2 (latest)
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
Papers citing
"Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"
50 / 2,054 papers shown
Title
Distributed Deep Learning Strategies For Automatic Speech Recognition
Wei Zhang
Xiaodong Cui
Ulrich Finkler
Brian Kingsbury
G. Saon
David S. Kung
M. Picheny
70
29
0
10 Apr 2019
Instance-Level Meta Normalization
Songhao Jia
Ding-Jie Chen
Hwann-Tzong Chen
57
20
0
06 Apr 2019
Iterative Normalization: Beyond Standardization towards Efficient Whitening
Lei Huang
Yi Zhou
Fan Zhu
Li Liu
Ling Shao
87
145
0
06 Apr 2019
Parallelizable Stack Long Short-Term Memory
Shuoyang Ding
Philipp Koehn
56
3
0
06 Apr 2019
Video Classification with Channel-Separated Convolutional Networks
Du Tran
Heng Wang
Lorenzo Torresani
Matt Feiszli
3DV
135
591
0
04 Apr 2019
Model Slicing for Supporting Complex Analytics with Elastic Inference Cost and Resource Constraints
Shaofeng Cai
Gang Chen
Beng Chin Ooi
Jinyang Gao
147
19
0
03 Apr 2019
Exploring Randomly Wired Neural Networks for Image Recognition
Saining Xie
Alexander Kirillov
Ross B. Girshick
Kaiming He
93
365
0
02 Apr 2019
Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses
Yingwei Li
S. Bai
Cihang Xie
Zhenyu A. Liao
Xiaohui Shen
Alan Yuille
AAML
150
51
0
01 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
332
1,000
0
01 Apr 2019
ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data
F. Diakogiannis
F. Waldner
P. Caccetta
Chen Wu
SSeg
155
1,359
0
01 Apr 2019
Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds
Masafumi Yamazaki
Akihiko Kasagi
Akihiro Tabuchi
Takumi Honda
Masahiro Miwa
Naoto Fukumoto
Tsuguchika Tabaru
Atsushi Ike
Kohta Nakashima
57
88
0
29 Mar 2019
Relation-Aware Graph Attention Network for Visual Question Answering
Linjie Li
Zhe Gan
Yu Cheng
Jingjing Liu
GNN
196
347
0
29 Mar 2019
TensorMask: A Foundation for Dense Object Segmentation
Xinlei Chen
Ross B. Girshick
Kaiming He
Piotr Dollár
ISeg
94
323
0
28 Mar 2019
Feature Intertwiner for Object Detection
Hongyang Li
Bo Dai
Shaoshuai Shi
Wanli Ouyang
Xiaogang Wang
OOD
47
13
0
28 Mar 2019
AutoSlim: Towards One-Shot Architecture Search for Channel Numbers
Jiahui Yu
Thomas Huang
79
56
0
27 Mar 2019
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
Jiarui Fang
Liandeng Li
Haohuan Fu
Jinlei Jiang
Wenlai Zhao
Conghui He
Xin You
Guangwen Yang
32
30
0
16 Mar 2019
Concatenated Feature Pyramid Network for Instance Segmentation
Yongqing Sun
Pranav Shenoy K.P.
J. Shimamura
Atsushi Sagata
ISeg
SSeg
52
8
0
16 Mar 2019
Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism
Nikoli Dryden
N. Maruyama
Tom Benson
Tim Moon
M. Snir
B. Van Essen
69
49
0
15 Mar 2019
Inefficiency of K-FAC for Large Batch Size Training
Linjian Ma
Gabe Montague
Jiayu Ye
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
58
24
0
14 Mar 2019
Communication-efficient distributed SGD with Sketching
Nikita Ivkin
D. Rothchild
Enayat Ullah
Vladimir Braverman
Ion Stoica
R. Arora
FedML
93
202
0
12 Mar 2019
Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect
Ang Li
Shuaiwen Leon Song
Jieyang Chen
Jiajia Li
Xu Liu
Nathan R. Tallent
Kevin J. Barker
GNN
112
220
0
11 Mar 2019
Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling
Xinyu Peng
Li Li
Feiyue Wang
BDL
140
59
0
11 Mar 2019
SSN: Learning Sparse Switchable Normalization via SparsestMax
Wenqi Shao
Jiamin Ren
Jingyu Li
Ruimao Zhang
Yudian Li
Xiaogang Wang
Ping Luo
69
56
0
09 Mar 2019
Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search
Xuzhao Li
Yiming Zhou
Zheng Pan
Jiashi Feng
3DV
72
158
0
09 Mar 2019
SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems
Beidi Chen
Tharun Medini
James Farwell
Sameh Gobriel
Charlie Tai
Anshumali Shrivastava
92
105
0
07 Mar 2019
High-Fidelity Image Generation With Fewer Labels
Mario Lucic
Michael Tschannen
Marvin Ritter
Xiaohua Zhai
Olivier Bachem
Sylvain Gelly
GAN
OOD
127
159
0
06 Mar 2019
Stabilizing the Lottery Ticket Hypothesis
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
88
103
0
05 Mar 2019
Complement Objective Training
Hao-Yun Chen
Pei-Hsin Wang
Chun-Hao Liu
Shih-Chieh Chang
Jia Pan
Yutian Chen
Wei Wei
Da-Cheng Juan
AAML
71
49
0
04 Mar 2019
Accelerating Training of Deep Neural Networks with a Standardization Loss
Jasmine Collins
Johannes Ballé
Jonathon Shlens
44
3
0
03 Mar 2019
Speeding up Deep Learning with Transient Servers
Shijian Li
R. Walls
Lijie Xu
Tian Guo
54
12
0
28 Feb 2019
Efficient Contextual Representation Learning Without Softmax Layer
Liunian Harold Li
Patrick H. Chen
Cho-Jui Hsieh
Kai-Wei Chang
59
6
0
28 Feb 2019
Equi-normalization of Neural Networks
Pierre Stock
Benjamin Graham
Rémi Gribonval
Hervé Jégou
ODL
46
18
0
27 Feb 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
73
22
0
21 Feb 2019
Augmentation for small object detection
Mate Kisantal
Z. Wojna
Jakub Murawski
Jacek Naruniec
Kyunghyun Cho
ObjD
58
552
0
19 Feb 2019
Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet/AlexNet Training in 1.5 Minutes
Peng Sun
Wansen Feng
Ruobing Han
Shengen Yan
Yonggang Wen
AI4CE
100
70
0
19 Feb 2019
LocalNorm: Robust Image Classification through Dynamically Regularized Normalization
Bojian Yin
S. Schaafsma
Henk Corporaal
H. Scholte
S. Bohté
30
2
0
18 Feb 2019
Beyond the Memory Wall: A Case for Memory-centric HPC System for Deep Learning
Youngeun Kwon
Minsoo Rhu
72
58
0
18 Feb 2019
MultiGrain: a unified image embedding for classes and instances
Maxim Berman
Hervé Jégou
Andrea Vedaldi
Iasonas Kokkinos
Matthijs Douze
75
111
0
14 Feb 2019
Bag of Freebies for Training Object Detection Neural Networks
Zhi-Li Zhang
Tong He
Hang Zhang
Zhongyue Zhang
Junyuan Xie
Mu Li
VLM
ObjD
85
190
0
11 Feb 2019
Hop: Heterogeneity-Aware Decentralized Training
Qinyi Luo
Jinkun Lin
Youwei Zhuo
Xuehai Qian
72
53
0
04 Feb 2019
Towards Federated Learning at Scale: System Design
Keith Bonawitz
Hubert Eichner
W. Grieskamp
Dzmitry Huba
A. Ingerman
...
H. B. McMahan
Timon Van Overveldt
David Petrou
Daniel Ramage
Jason Roselander
FedML
132
2,682
0
04 Feb 2019
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Haowei He
Gao Huang
Yang Yuan
ODL
MLT
89
150
0
02 Feb 2019
TF-Replicator: Distributed Machine Learning for Researchers
P. Buchlovsky
David Budden
Dominik Grewe
Chris Jones
John Aslanides
...
Aidan Clark
Sergio Gomez Colmenarejo
Aedan Pope
Fabio Viola
Dan Belov
GNN
OffRL
AI4CE
81
20
0
01 Feb 2019
Compressing Gradient Optimizers via Count-Sketches
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
58
36
0
01 Feb 2019
Real-world Mapping of Gaze Fixations Using Instance Segmentation for Road Construction Safety Applications
Idris Jeelani
Khashayar Asadi
Hariharan Ramshankar
Kevin K. Han
A. Albert
29
5
0
30 Jan 2019
Semantic Redundancies in Image-Classification Datasets: The 10% You Don't Need
Vighnesh Birodkar
H. Mobahi
Samy Bengio
88
82
0
29 Jan 2019
Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample
A. Berahas
Majid Jahani
Peter Richtárik
Martin Takáč
107
41
0
28 Jan 2019
Error Feedback Fixes SignSGD and other Gradient Compression Schemes
Sai Praneeth Karimireddy
Quentin Rebjock
Sebastian U. Stich
Martin Jaggi
113
503
0
28 Jan 2019
SGD: General Analysis and Improved Rates
Robert Mansel Gower
Nicolas Loizou
Xun Qian
Alibek Sailanbayev
Egor Shulgin
Peter Richtárik
109
383
0
27 Jan 2019
Heterogeneity-aware Gradient Coding for Straggler Tolerance
Yining Qi
Song Guo
Bin Tang
Ruixuan Li
Chengjie Li
113
20
0
27 Jan 2019