Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
MoMe
183
630
0
11 Dec 2019
IoU-uniform R-CNN: Breaking Through the Limitations of RPN
Li Zhu
Zihao Xie
Liman Liu
B. Tao
Wenbing Tao
ObjD
58
49
0
11 Dec 2019
SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization
Xianzhi Du
Nayeon Lee
Pengchong Jin
Golnaz Ghiasi
Mingxing Tan
Huayu Chen
Quoc V. Le
Xiaodan Song
78
174
0
10 Dec 2019
InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers
T. Nguyen
Animesh Garg
Richard G. Baraniuk
Anima Anandkumar
TPM
111
9
0
09 Dec 2019
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
Dan Hendrycks
Norman Mu
E. D. Cubuk
Barret Zoph
Justin Gilmer
Balaji Lakshminarayanan
OOD, UQCV
207
1,314
0
05 Dec 2019
BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
Boyan Zhou
Quan Cui
Xiu-Shen Wei
Zhao-Min Chen
326
806
0
05 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chaoxia Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krahenbuhl
91
94
0
02 Dec 2019
Face Detection with Feature Pyramids and Landmarks
Samuel W. F. Earp
Pavit Noinongyao
Justin A. Cairns
Ankush Ganguly
CVBM
82
14
0
02 Dec 2019
Gate-Shift Networks for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
3DPC
97
155
0
01 Dec 2019
Pythia: AI-assisted Code Completion System
Alexey Svyatkovskiy
Ying Zhao
Shengyu Fu
Neel Sundaresan
79
155
0
29 Nov 2019
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Humam Alwassel
D. Mahajan
Bruno Korbar
Lorenzo Torresani
Guohao Li
Du Tran
SSL
168
433
0
28 Nov 2019
Decision Propagation Networks for Image Classification
Keke Tang
Peng Song
Yuexin Ma
Zhaoquan Gu
Yu Su
Zhihong Tian
Wenping Wang
23
0
0
27 Nov 2019
Stage-based Hyper-parameter Optimization for Deep Learning
Ahnjae Shin
Dongjin Shin
Sungwoo Cho
Do Yoon Kim
Eunji Jeong
Gyeong-In Yu
Byung-Gon Chun
27
4
0
24 Nov 2019
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Lu Jiang
Di Huang
Mason Liu
Weilong Yang
NoLa
51
3
0
21 Nov 2019
Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks
Saurabh Singh
Shankar Krishnan
UQCV
108
126
0
21 Nov 2019
Local AdaAlter: Communication-Efficient Stochastic Gradient Descent with Adaptive Learning Rates
Cong Xie
Oluwasanmi Koyejo
Indranil Gupta
Yanghua Peng
66
42
0
20 Nov 2019
MetH: A family of high-resolution and variable-shape image challenges
Ferran Parés
Dario Garcia-Gasulla
Harald Servat
Jesús Labarta
Eduard Ayguadé
53
0
0
20 Nov 2019
Auto-Precision Scaling for Distributed Deep Learning
Ruobing Han
J. Demmel
Yang You
43
5
0
20 Nov 2019
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
Shaoshuai Shi
Zhenheng Tang
Qiang-qiang Wang
Kaiyong Zhao
Xiaowen Chu
65
22
0
20 Nov 2019
Neural Network Pruning with Residual-Connections and Limited-Data
Jian-Hao Luo
Jianxin Wu
95
118
0
19 Nov 2019
Implicit Regularization and Convergence for Weight Normalization
Xiaoxia Wu
Yan Sun
Zhaolin Ren
Shanshan Wu
Zhiyuan Li
Suriya Gunasekar
Rachel A. Ward
Qiang Liu
155
21
0
18 Nov 2019
Affine Self Convolution
Nichita Diaconu
Daniel E. Worrall
35
3
0
18 Nov 2019
Distributed Low Precision Training Without Mixed Precision
Zehua Cheng
Weiyan Wang
Yan Pan
Thomas Lukasiewicz
MQ
52
5
0
18 Nov 2019
Learning with Hierarchical Complement Objective
Hao-Yun Chen
Li-Huang Tsai
Shih-Chieh Chang
Jia-Yu Pan
Yu-Ting Chen
Wei Wei
Da-Cheng Juan
VLM
65
5
0
17 Nov 2019
Instance Shadow Detection
Tianyu Wang
Xiaowei Hu
Qiong Wang
Pheng-Ann Heng
Chi-Wing Fu
108
74
0
16 Nov 2019
Selective sampling for accelerating training of deep neural networks
Berry Weinstein
Shai Fine
Y. Hel-Or
23
3
0
16 Nov 2019
Label-similarity Curriculum Learning
Ürün Dogan
A. Deshmukh
Marcin Machura
Christian Igel
61
21
0
15 Nov 2019
Optimal Mini-Batch Size Selection for Fast Gradient Descent
M. Perrone
Haidar Khan
Changhoan Kim
Anastasios Kyrillidis
Jerry Quinn
V. Salapura
46
9
0
15 Nov 2019
Understanding the Disharmony between Weight Normalization Family and Weight Decay: $\epsilon$-shifted $L_2$ Regularizer
Li Xiang
Chen Shuo
Xia Yan
Yang Jian
59
2
0
14 Nov 2019
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
275
12,175
0
13 Nov 2019
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow
A. A. Awan
Arpan Jain
Quentin G. Anthony
Hari Subramoni
Dhabaleswar K. Panda
MoE, AI4CE
55
5
0
12 Nov 2019
DC-S3GD: Delay-Compensated Stale-Synchronous SGD for Large-Scale Decentralized Neural Network Training
Alessandro Rigazzi
41
5
0
06 Nov 2019
A Spark ML driven preprocessing approach for deep learning based scholarly data applications
Samiya Khan
Xiufeng Liu
Mansaf Alam
24
0
0
04 Nov 2019
Progressive Compressed Records: Taking a Byte out of Deep Learning Data
Michael Kuchnik
George Amvrosiadis
Virginia Smith
86
9
0
01 Nov 2019
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
Erik Wijmans
Abhishek Kadian
Ari S. Morcos
Stefan Lee
Irfan Essa
Devi Parikh
Manolis Savva
Dhruv Batra
97
485
0
01 Nov 2019
Deep convolutional neural networks for multi-scale time-series classification and application to disruption prediction in fusion devices
R. Churchill
the DIII-D team
AI4CE
36
10
0
31 Oct 2019
Small-GAN: Speeding Up GAN Training Using Core-sets
Samarth Sinha
Hang Zhang
Anirudh Goyal
Yoshua Bengio
Hugo Larochelle
Augustus Odena
GAN
99
77
0
29 Oct 2019
E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
Yue Wang
Ziyu Jiang
Xiaohan Chen
Pengfei Xu
Yang Zhao
Yingyan Lin
Zhangyang Wang
MQ
114
83
0
29 Oct 2019
Asynchronous Decentralized SGD with Quantized and Local Updates
Giorgi Nadiradze
Amirmojtaba Sabour
Peter Davies
Shigang Li
Dan Alistarh
79
52
0
27 Oct 2019
Learning an Efficient Network for Large-Scale Hierarchical Object Detection with Data Imbalance: 3rd Place Solution to Open Images Challenge 2019
Xingyuan Bu
Junran Peng
Changbao Wang
Cunjun Yu
Guoliang Cao
ObjD
26
2
0
26 Oct 2019
A Simple Dynamic Learning Rate Tuning Algorithm For Automated Training of DNNs
Koyel Mukherjee
Alind Khare
Ashish Verma
74
15
0
25 Oct 2019
Gradient Sparification for Asynchronous Distributed Training
Zijie Yan
FedML
26
1
0
24 Oct 2019
The Practicality of Stochastic Optimization in Imaging Inverse Problems
Junqi Tang
K. Egiazarian
Mohammad Golbabaee
Mike Davies
70
32
0
22 Oct 2019
Sparsification as a Remedy for Staleness in Distributed Asynchronous SGD
Rosa Candela
Giulio Franzese
Maurizio Filippone
Pietro Michiardi
91
1
0
21 Oct 2019
A Stochastic Extra-Step Quasi-Newton Method for Nonsmooth Nonconvex Optimization
Minghan Yang
Andre Milzarek
Zaiwen Wen
Tong Zhang
ODL
96
36
0
21 Oct 2019
Machine Learning Systems for Highly-Distributed and Rapidly-Growing Data
Kevin Hsieh
SyDa, OOD
48
4
0
18 Oct 2019
Mirror Descent View for Neural Network Quantization
Thalaiyasingam Ajanthan
Kartik Gupta
Philip Torr
Leonid Sigal
P. Dokania
MQ
72
25
0
18 Oct 2019
Federated Learning with Unbiased Gradient Aggregation and Controllable Meta Updating
Xin Yao
Tianchi Huang
Ruixiao Zhang
Ruiyu Li
Lifeng Sun
FedML
85
72
0
18 Oct 2019
Improving the convergence of SGD through adaptive batch sizes
Scott Sievert
Zachary B. Charles
ODL
74
8
0
18 Oct 2019
Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
Yogesh Balaji
Tom Goldstein
Judy Hoffman
AAML
205
103
0
17 Oct 2019