ResearchTrend.AI
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Dynamic Graph Message Passing Networks
Li Zhang
Dan Xu
Anurag Arnab
Philip Torr
GNN
102
138
0
19 Aug 2019
Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks
Yanzhao Wu
Ling Liu
Juhyun Bae
Ka-Ho Chow
Arun Iyengar
C. Pu
Wenqi Wei
Lei Yu
Qi Zhang
58
70
0
18 Aug 2019
Towards Better Generalization: BP-SVRG in Training Deep Neural Networks
Hao Jin
Dachao Lin
Zhihua Zhang
ODL
45
2
0
18 Aug 2019
Regularizing CNN Transfer Learning with Randomised Regression
Yang Zhong
A. Maki
119
13
0
16 Aug 2019
IoU-balanced Loss Functions for Single-stage Object Detection
Shengkai Wu
Jinrong Yang
Xinggang Wang
Xiaoping Li
ObjD
82
102
0
15 Aug 2019
LIP: Local Importance-based Pooling
Ziteng Gao
Limin Wang
Gangshan Wu
FAtt
85
96
0
12 Aug 2019
Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency
Elad Hoffer
Berry Weinstein
Itay Hubara
Tal Ben-Nun
Torsten Hoefler
Daniel Soudry
113
20
0
12 Aug 2019
On the Variance of the Adaptive Learning Rate and Beyond
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
397
1,916
0
08 Aug 2019
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg, VLM
133
1,379
0
08 Aug 2019
Learning a Unified Embedding for Visual Search at Pinterest
Andrew Zhai
Hao-Yu Wu
Eric Tzeng
Dong Huk Park
Charles R. Rosenberg
DML
74
52
0
05 Aug 2019
MoGA: Searching Beyond MobileNetV3
Xiangxiang Chu
Bo Zhang
Ruijun Xu
91
42
0
04 Aug 2019
Improving localization-based approaches for breast cancer screening exam classification
Thibault Févry
Jason Phang
Nan Wu
S. G. Kim
Linda Moy
Kyunghyun Cho
Krzysztof J. Geras
MedIm
43
10
0
01 Aug 2019
Chainer: A Deep Learning Framework for Accelerating the Research Cycle
Seiya Tokui
Ryosuke Okuta
Takuya Akiba
Yusuke Niitani
Toru Ogawa
Shunta Saito
Shuji Suzuki
Kota Uenishi
Brian K. Vogel
Hiroyuki Yamazaki Vincent
BDL, AI4CE
84
130
0
01 Aug 2019
Accelerating CNN Training by Pruning Activation Gradients
Xucheng Ye
Pengcheng Dai
Junyu Luo
Xin Guo
Weisheng Zhao
Jianlei Yang
Yiran Chen
23
2
0
01 Aug 2019
Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Saptadeep Pal
Eiman Ebrahimi
A. Zulfiqar
Yaosheng Fu
Victor Zhang
Szymon Migacz
D. Nellans
Puneet Gupta
92
59
0
30 Jul 2019
Deep learning research landscape & roadmap in a nutshell: past, present and future -- Towards deep cortical learning
Aras R. Dargazany
40
0
0
30 Jul 2019
Taming Momentum in a Distributed Asynchronous Environment
Ido Hakimi
Saar Barkai
Moshe Gabel
Assaf Schuster
93
23
0
26 Jul 2019
DR Loss: Improving Object Detection by Distributional Ranking
Qi Qian
Lei Chen
Hao Li
Rong Jin
46
70
0
23 Jul 2019
BPPSA: Scaling Back-propagation by Parallel Scan Algorithm
Shang Wang
Yifan Bai
Gennady Pekhimenko
60
7
0
23 Jul 2019
Switchable Normalization for Learning-to-Normalize Deep Representation
Ping Luo
Ruimao Zhang
Jiamin Ren
Zhanglin Peng
Jingyu Li
129
74
0
22 Jul 2019
Decentralized Deep Learning with Arbitrary Communication Compression
Anastasia Koloskova
Tao R. Lin
Sebastian U. Stich
Martin Jaggi
FedML
98
236
0
22 Jul 2019
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
173
736
0
19 Jul 2019
FOSNet: An End-to-End Trainable Deep Neural Network for Scene Recognition
Hongje Seong
Junhyuk Hyun
Euntai Kim
63
52
0
17 Jul 2019
Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming
Claudio Michaelis
Benjamin Mitzkus
Robert Geirhos
E. Rusak
Oliver Bringmann
Alexander S. Ecker
Matthias Bethge
Wieland Brendel
3DPC
107
454
0
17 Jul 2019
Natural Adversarial Examples
Dan Hendrycks
Kevin Zhao
Steven Basart
Jacob Steinhardt
Basel Alomair
OODD
304
1,487
0
16 Jul 2019
Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems
Mark D. McDonnell
Hesham Mostafa
Runchun Wang
Andre van Schaik
MQ
47
2
0
16 Jul 2019
Learning Neural Networks with Adaptive Regularization
Han Zhao
Yao-Hung Hubert Tsai
Ruslan Salakhutdinov
Geoffrey J. Gordon
52
15
0
14 Jul 2019
A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition
Wei Zhang
Xiaodong Cui
Ulrich Finkler
G. Saon
Abdullah Kayi
A. Buyuktosunoglu
Brian Kingsbury
David S. Kung
M. Picheny
49
19
0
10 Jul 2019
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li
Colin Wei
Tengyu Ma
90
300
0
10 Jul 2019
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model
Guodong Zhang
Lala Li
Zachary Nado
James Martens
Sushant Sachdeva
George E. Dahl
Christopher J. Shallue
Roger C. Grosse
126
154
0
09 Jul 2019
Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale
A. G. Baydin
Lei Shao
W. Bhimji
Lukas Heinrich
Lawrence Meadows
...
Philip Torr
Victor W. Lee
Kyle Cranmer
P. Prabhat
Frank Wood
80
58
0
08 Jul 2019
INN: Inflated Neural Networks for IPMN Diagnosis
Rodney LaLonde
Irene Tanner
K. Nikiforaki
G. Papadakis
Pujan Kandel
C. Bolan
Michael B. Wallace
Ulas Bagci
41
12
0
30 Jun 2019
Deep Gamblers: Learning to Abstain with Portfolio Theory
Liu Ziyin
Zhikang T. Wang
Paul Pu Liang
Ruslan Salakhutdinov
Louis-Philippe Morency
Masahito Ueda
116
114
0
29 Jun 2019
Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent
Shuheng Shen
Linli Xu
Jingchang Liu
Xianfeng Liang
Yifei Cheng
ODL, FedML
68
24
0
28 Jun 2019
Fast Training of Sparse Graph Neural Networks on Dense Hardware
Matej Balog
B. V. Merrienboer
Subhodeep Moitra
Yujia Li
Daniel Tarlow
GNN
58
10
0
27 Jun 2019
Selection via Proxy: Efficient Data Selection for Deep Learning
Cody Coleman
Christopher Yeh
Stephen Mussmann
Baharan Mirzasoleiman
Peter Bailis
Percy Liang
J. Leskovec
Matei A. Zaharia
132
351
0
26 Jun 2019
The Adversarial Robustness of Sampling
Omri Ben-Eliezer
E. Yogev
TTA, AAML
61
48
0
26 Jun 2019
Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD
Kosuke Haruki
Taiji Suzuki
Yohei Hamakawa
Takeshi Toda
Ryuji Sakai
M. Ozawa
Mitsuhiro Kimura
ODL
61
17
0
26 Jun 2019
The Difficulty of Training Sparse Neural Networks
Utku Evci
Fabian Pedregosa
Aidan Gomez
Erich Elsen
81
101
0
25 Jun 2019
Database Meets Deep Learning: Challenges and Opportunities
Wei Wang
Meihui Zhang
Gang Chen
H. V. Jagadish
Beng Chin Ooi
K. Tan
89
148
0
21 Jun 2019
Deep Leakage from Gradients
Ligeng Zhu
Zhijian Liu
Song Han
FedML
114
2,242
0
21 Jun 2019
Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
Kaidi Cao
Colin Wei
Adrien Gaidon
Nikos Arechiga
Tengyu Ma
139
1,617
0
18 Jun 2019
Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
Xu Xiang
Shuai Wang
Houjun Huang
Y. Qian
Kai Yu
DRL
77
145
0
18 Jun 2019
MMDetection: Open MMLab Detection Toolbox and Benchmark
Kai-xiang Chen
Jiaqi Wang
Jiangmiao Pang
Yuhang Cao
Yu Xiong
...
Jingdong Wang
Jianping Shi
Wanli Ouyang
Chen Change Loy
Dahua Lin
VOS
240
2,893
0
17 Jun 2019
IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things
Cheng-Yang Fu
Tamara L. Berg
Alexander C. Berg
ISeg, VLM
123
18
0
15 Jun 2019
Layered SGD: A Decentralized and Synchronous SGD Algorithm for Scalable Deep Neural Network Training
K. Yu
Thomas Flynn
Shinjae Yoo
N. D'Imperio
OffRL
58
6
0
13 Jun 2019
Four Things Everyone Should Know to Improve Batch Normalization
Cecilia Summers
M. Dinneen
85
52
0
09 Jun 2019
Making Asynchronous Stochastic Gradient Descent Work for Transformers
Alham Fikri Aji
Kenneth Heafield
68
13
0
08 Jun 2019
Video Modeling with Correlation Networks
Heng Wang
Du Tran
Lorenzo Torresani
Matt Feiszli
116
129
0
07 Jun 2019
Understanding Generalization through Visualizations
Wenjie Huang
Z. Emam
Micah Goldblum
Liam H. Fowl
J. K. Terry
Furong Huang
Tom Goldstein
AI4CE
51
80
0
07 Jun 2019