Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Practical Deep Learning with Bayesian Principles
Kazuki Osawa
S. Swaroop
Anirudh Jain
Runa Eschenhagen
Richard Turner
Rio Yokota
Mohammad Emtiyaz Khan
BDL, UQCV
167
247
0
06 Jun 2019
The Architectural Implications of Facebook's DNN-based Personalized Recommendation
Udit Gupta
Carole-Jean Wu
Xiaodong Wang
Maxim Naumov
Brandon Reagen
...
Andrey Malevich
Dheevatsa Mudigere
M. Smelyanskiy
Liang Xiong
Xuan Zhang
GNN
118
292
0
06 Jun 2019
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets
Devansh Arpit
Victor Campos
Yoshua Bengio
83
59
0
05 Jun 2019
Distributed Training with Heterogeneous Data: Bridging Median- and Mean-Based Algorithms
Xiangyi Chen
Tiancong Chen
Haoran Sun
Zhiwei Steven Wu
Mingyi Hong
FedML
64
74
0
04 Jun 2019
Training Neural Response Selection for Task-Oriented Dialogue Systems
Matthew Henderson
Ivan Vulić
D. Gerz
I. Casanueva
Paweł Budzianowski
Sam Coope
Georgios P. Spithourakis
Tsung-Hsien Wen
N. Mrksic
Pei-hao Su
54
111
0
04 Jun 2019
Robust Learning Under Label Noise With Iterative Noise-Filtering
D. Nguyen
Thi-Phuong-Nhung Ngo
Zhongyu Lou
Michael Klar
Laura Beggel
Thomas Brox
NoLa
68
17
0
01 Jun 2019
PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
Thijs Vogels
Sai Praneeth Karimireddy
Martin Jaggi
105
322
0
31 May 2019
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
Lin Song
Shiwei Zhang
Gang Yu
Hongbin Sun
145
83
0
31 May 2019
Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing
J. D. Loera
Jamie Haddock
A. Ma
Deanna Needell
29
0
0
31 May 2019
Accelerated Sparsified SGD with Error Feedback
Tomoya Murata
Taiji Suzuki
56
2
0
29 May 2019
Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
Boris Ginsburg
P. Castonguay
Oleksii Hrinchuk
Oleksii Kuchaiev
Vitaly Lavrukhin
Ryan Leary
Jason Chun Lok Li
Huyen Nguyen
Yang Zhang
Jonathan M. Cohen
ODL
90
13
0
27 May 2019
On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks
S. Thulasidasan
Gopinath Chennupati
J. Bilmes
Tanmoy Bhattacharya
S. Michalak
UQCV
103
546
0
27 May 2019
Natural Compression for Distributed Deep Learning
Samuel Horváth
Chen-Yu Ho
L. Horvath
Atal Narayan Sahu
Marco Canini
Peter Richtárik
99
151
0
27 May 2019
Classification Accuracy Score for Conditional Generative Models
Suman V. Ravuri
Oriol Vinyals
EGVM
96
235
0
26 May 2019
FasTrCaps: An Integrated Framework for Fast yet Accurate Training of Capsule Networks
Alberto Marchisio
Beatrice Bussolino
Alessio Colucci
Muhammad Abdullah Hanif
Maurizio Martina
Guido Masera
Mohamed Bennai
25
7
0
24 May 2019
Light-Weight RetinaNet for Object Detection
Yixing Li
Fengbo Ren
ObjD
26
33
0
24 May 2019
Accelerating DNN Training in Wireless Federated Edge Learning Systems
Jinke Ren
Guanding Yu
Guangyao Ding
FedML
85
175
0
23 May 2019
Online Hyper-parameter Learning for Auto-Augmentation Strategy
Chen Lin
Minghao Guo
Chuming Li
Yuan Xin
Wei Wu
Dahua Lin
Wanli Ouyang
Junjie Yan
ODL
71
84
0
17 May 2019
Online Normalization for Training Neural Networks
Vitaliy Chiley
I. Sharapov
Atli Kosson
Urs Koster
R. Reece
S. D. L. Fuente
Vishal Subbiah
Michael James
OnRL
74
55
0
15 May 2019
Scaling Distributed Training of Flood-Filling Networks on HPC Infrastructure for Brain Mapping
Wu Dong
Murat Keçeli
Rafael Vescovi
Hanyu Li
Corey Adams
...
T. Uram
V. Vishwanath
N. Ferrier
B. Kasthuri
P. Littlewood
FedML, AI4CE
40
9
0
13 May 2019
Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints
Mengtian Li
Ersin Yumer
Deva Ramanan
72
49
0
12 May 2019
Multi-scale Aggregation R-CNN for 2D Multi-person Pose Estimation
Gyeongsik Moon
Ju Yong Chang
Kyoung Mu Lee
3DH
128
9
0
10 May 2019
On the Linear Speedup Analysis of Communication Efficient Momentum SGD for Distributed Non-Convex Optimization
Hao Yu
Rong Jin
Sen Yang
FedML
111
387
0
09 May 2019
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study
Daniel S. Park
Jascha Narain Sohl-Dickstein
Quoc V. Le
Samuel L. Smith
96
57
0
09 May 2019
AI Enabling Technologies: A Survey
V. Gadepally
Justin A. Goodwin
J. Kepner
Albert Reuther
Hayley Reynolds
S. Samsi
Jonathan Su
David Martinez
46
25
0
08 May 2019
Few-Shot Adaptive Gaze Estimation
Seonwook Park
Shalini De Mello
Pavlo Molchanov
Umar Iqbal
Otmar Hilliges
Jan Kautz
79
201
0
06 May 2019
Fast and Robust Distributed Learning in High Dimension
El-Mahdi El-Mhamdi
R. Guerraoui
Sébastien Rouault
OOD, FedML
53
16
0
05 May 2019
Accurate Face Detection for High Performance
Faen Zhang
Xinyu Fan
Guo Ai
Jianfei Song
Yongqiang Qin
Jiahong Wu
3DH, CVBM
82
36
0
05 May 2019
Unsupervised Pre-Training of Image Features on Non-Curated Data
Mathilde Caron
Piotr Bojanowski
Julien Mairal
Armand Joulin
SSL
85
9
0
03 May 2019
Scaling and Benchmarking Self-Supervised Visual Representation Learning
Priya Goyal
D. Mahajan
Abhinav Gupta
Ishan Misra
SSL
105
397
0
03 May 2019
Transfer of Adversarial Robustness Between Perturbation Types
Daniel Kang
Yi Sun
Tom B. Brown
Dan Hendrycks
Jacob Steinhardt
AAML
71
49
0
03 May 2019
Large-scale weakly-supervised pre-training for video action recognition
Deepti Ghadiyaram
Matt Feiszli
Du Tran
Xueting Yan
Heng Wang
D. Mahajan
64
299
0
02 May 2019
Billion-scale semi-supervised learning for image classification
I. Z. Yalniz
Hervé Jégou
Kan Chen
Manohar Paluri
D. Mahajan
SSL
146
464
0
02 May 2019
Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation
Xin Chen
Lingxi Xie
Jun Wu
Qi Tian
AI4TS, MQ
91
667
0
29 Apr 2019
Collage Inference: Using Coded Redundancy for Low Variance Distributed Image Classification
Krishnagiri Narra
Zhifeng Lin
Ganesh Ananthanarayanan
A. Avestimehr
M. Annavaram
VLM
112
6
0
27 Apr 2019
Improved Conditional VRNNs for Video Prediction
Lluis Castrejon
Nicolas Ballas
Aaron Courville
VGen, DRL
151
164
0
27 Apr 2019
Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources
Yanghua Peng
Hang Zhang
Yifei Ma
Tong He
Zhi-Li Zhang
Sheng Zha
Mu Li
50
23
0
26 Apr 2019
Local Relation Networks for Image Recognition
Han Hu
Zheng Zhang
Zhenda Xie
Stephen Lin
FAtt
129
503
0
25 Apr 2019
Communication trade-offs for synchronized distributed SGD with large step size
Kumar Kshitij Patel
Hadrien Hendrikx
FedML
66
27
0
25 Apr 2019
Declarative Recursive Computation on an RDBMS, or, Why You Should Use a Database For Distributed Machine Learning
Dimitrije Jankov
Shangyu Luo
Binhang Yuan
Zhuhua Cai
Jia Zou
C. Jermaine
Zekai J. Gao
71
62
0
25 Apr 2019
Realizing Petabyte Scale Acoustic Modeling
S. Parthasarathi
Nitin Sivakrishnan
Pranav Ladkat
N. Strom
60
11
0
24 Apr 2019
Analyzing the benefits of communication channels between deep learning models
Philippe Lacaille
34
0
0
19 Apr 2019
Knowledge Distillation via Route Constrained Optimization
Xiao Jin
Baoyun Peng
Yichao Wu
Yu Liu
Jiaheng Liu
Ding Liang
Junjie Yan
Xiaolin Hu
90
172
0
19 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
179
1,257
0
18 Apr 2019
Question Guided Modular Routing Networks for Visual Question Answering
Yanze Wu
Qiang Sun
Jianqi Ma
Bin Li
Yanwei Fu
Yao Peng
Xiangyang Xue
69
1
0
17 Apr 2019
Objects as Points
Xingyi Zhou
Dequan Wang
Philipp Krahenbuhl
3DPC
138
3,265
0
16 Apr 2019
Detecting Anemia from Retinal Fundus Images
A. Mitani
Yun-Hui Liu
Abigail E. Huang
G. Corrado
L. Peng
D. Webster
N. Hammel
A. Varadarajan
26
32
0
12 Apr 2019
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
Yunpeng Chen
Haoqi Fan
Bing Xu
Zhicheng Yan
Yannis Kalantidis
Marcus Rohrbach
Shuicheng Yan
Jiashi Feng
109
565
0
10 Apr 2019
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
A. Gordon
Hanhan Li
Rico Jonschkowski
A. Angelova
MDE
102
366
0
10 Apr 2019
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
Brandon Yang
Gabriel Bender
Quoc V. Le
Jiquan Ngiam
MedIm, 3DV
100
642
0
10 Apr 2019