ResearchTrend.AI
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH
arXiv:1706.02677

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Scalable Smartphone Cluster for Deep Learning
Payam Abdisarabshali
Jaehee Jang
Nicholas Accurso
Seijoon Kim
Filippo Malandra
Moon Sik Jeong
Kwang Choon Kim
Seon Heo
Yoon-Ji Kim
Sungroh Yoon
30
4
0
23 Oct 2021
Feature Learning and Signal Propagation in Deep Neural Networks
Yizhang Lou
Chris Mingard
Yoonsoo Nam
Soufiane Hayou
MDE
82
18
0
22 Oct 2021
Is High Variance Unavoidable in RL? A Case Study in Continuous Control
Johan Bjorck
Carla P. Gomes
Kilian Q. Weinberger
92
24
0
21 Oct 2021
Asynchronous Decentralized Distributed Training of Acoustic Models
Xiaodong Cui
Wei Zhang
Abdullah Kayi
Mingrui Liu
Ulrich Finkler
Brian Kingsbury
G. Saon
David S. Kung
61
3
0
21 Oct 2021
Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning
Ningning Xie
Tamara Norman
Dominik Grewe
Dimitrios Vytiniotis
75
17
0
20 Oct 2021
Layer-wise Adaptive Model Aggregation for Scalable Federated Learning
Sunwoo Lee
Tuo Zhang
Chaoyang He
Salman Avestimehr
FedML
85
51
0
19 Oct 2021
NAS-HPO-Bench-II: A Benchmark Dataset on Joint Optimization of Convolutional Neural Network Architecture and Training Hyperparameters
Yoichi Hirose
Nozomu Yoshinari
Shinichi Shirakawa
63
13
0
19 Oct 2021
Robust Pedestrian Attribute Recognition Using Group Sparsity for Occlusion Videos
Geonu Lee
Kimin Yun
Jungchan Cho
85
2
0
17 Oct 2021
MG-GCN: Scalable Multi-GPU GCN Training Framework
M. F. Balin
Kaan Sancak
Ümit V. Çatalyürek
GNN
67
7
0
17 Oct 2021
Trade-offs of Local SGD at Scale: An Empirical Study
Jose Javier Gonzalez Ortiz
Jonathan Frankle
Michael G. Rabbat
Ari S. Morcos
Nicolas Ballas
FedML
86
18
0
15 Oct 2021
Adaptive Differentially Private Empirical Risk Minimization
Xiaoxia Wu
Lingxiao Wang
Irina Cristali
Quanquan Gu
Rebecca Willett
158
6
0
14 Oct 2021
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training
Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Naoya Takahashi
E. Tsunoo
Yuki Mitsufuji
72
66
0
14 Oct 2021
The Impact of Spatiotemporal Augmentations on Self-Supervised Audiovisual Representation Learning
Haider Al-Tahan
Y. Mohsenzadeh
SSL, AI4TS
49
0
0
13 Oct 2021
Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers
Yujing Ma
Florin Rusu
Kesheng Wu
A. Sim
102
3
0
13 Oct 2021
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li
Tianhao Wang
Sanjeev Arora
MLT
121
105
0
13 Oct 2021
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection
Yuichiro Koyama
Kazuhide Shigemi
Masafumi Takahashi
Kazuki Shimada
Naoya Takahashi
E. Tsunoo
Shusuke Takahashi
Yuki Mitsufuji
69
12
0
13 Oct 2021
Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning
Chongjian Ge
Youwei Liang
Yibing Song
Jianbo Jiao
Jue Wang
Ping Luo
ViT
74
35
0
11 Oct 2021
ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training
Hui-Po Wang
Sebastian U. Stich
Yang He
Mario Fritz
FedML, AI4CE
73
50
0
11 Oct 2021
The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation
Guillem Brasó
Nikita Kister
Laura Leal-Taixé
3DPC
86
40
0
11 Oct 2021
Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits
Yan Li
Dhruv Choudhary
Xiaohan Wei
Baichuan Yuan
Bhargav Bhushanam
T. Zhao
Guanghui Lan
79
6
0
10 Oct 2021
An Empirical Study on Compressed Decentralized Stochastic Gradient Algorithms with Overparameterized Models
A. Rao
Hoi-To Wai
33
0
0
09 Oct 2021
Pairwise Margin Maximization for Deep Neural Networks
Berry Weinstein
Shai Fine
Y. Hel-Or
31
0
0
09 Oct 2021
A Loss Curvature Perspective on Training Instability in Deep Learning
Justin Gilmer
Behrooz Ghorbani
Ankush Garg
Sneha Kudugunta
Behnam Neyshabur
David E. Cardoze
George E. Dahl
Zachary Nado
Orhan Firat
ODL
77
37
0
08 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
150
9
0
08 Oct 2021
A Baseline Framework for Part-level Action Parsing and Action Recognition
Xiaodong Chen
Xinchen Liu
Kun Liu
Wu Liu
Tao Mei
85
3
0
07 Oct 2021
Influence-Balanced Loss for Imbalanced Visual Classification
Seulki Park
Jongin Lim
Younghan Jeon
J. Choi
CVBM
140
136
0
06 Oct 2021
Batch size-invariance for policy optimization
Jacob Hilton
K. Cobbe
John Schulman
120
14
0
01 Oct 2021
Stochastic Contrastive Learning
Jason Ramapuram
Dan Busbridge
Xavier Suau
Russ Webb
BDL, SSL
105
3
0
01 Oct 2021
Evaluating the fairness of fine-tuning strategies in self-supervised learning
Jason Ramapuram
Dan Busbridge
Russ Webb
69
6
0
01 Oct 2021
Do Self-Supervised and Supervised Methods Learn Similar Visual Representations?
Tom George Grigg
Dan Busbridge
Jason Ramapuram
Russ Webb
SSL, DRL
88
27
0
01 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
303
500
0
01 Oct 2021
Powerpropagation: A sparsity inducing weight reparameterisation
Jonathan Richard Schwarz
Siddhant M. Jayakumar
Razvan Pascanu
P. Latham
Yee Whye Teh
194
55
0
01 Oct 2021
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation
Jay Patravali
Gaurav Mittal
Ye Yu
Fuxin Li
Mei Chen
94
19
0
30 Sep 2021
A Technical Report for ICCV 2021 VIPriors Re-identification Challenge
Cen Liu
Yunbo Peng
Yue-Hsun Lin
72
0
0
30 Sep 2021
IntentVizor: Towards Generic Query Guided Interactive Video Summarization
Guande Wu
Jianzhe Lin
Claudio T. Silva
87
24
0
30 Sep 2021
Stochastic Training is Not Necessary for Generalization
Jonas Geiping
Micah Goldblum
Phillip E. Pope
Michael Moeller
Tom Goldstein
175
76
0
29 Sep 2021
Faster Improvement Rate Population Based Training
Valentin Dalibard
Max Jaderberg
67
13
0
28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Wang
Song Han
100
65
0
27 Sep 2021
Speeding-up One-vs-All Training for Extreme Classification via Smart Initialization
Erik Schultheis
Rohit Babbar
53
2
0
27 Sep 2021
NanoBatch Privacy: Enabling fast Differentially Private learning on the IPU
Edward H. Lee
M. M. Krell
Alexander Tsyplikhin
Victoria Rege
E. Colak
Kristen W. Yeom
FedML
62
0
0
24 Sep 2021
SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction
Yuxin Xiao
Zecheng Zhang
Yuning Mao
Carl Yang
Jiawei Han
RALM, AI4TS
86
49
0
24 Sep 2021
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers
Changlin Li
Guangrun Wang
Bing Wang
Xiaodan Liang
Zhihui Li
Xiaojun Chang
96
9
0
21 Sep 2021
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Feilong Chen
Xiuyi Chen
Fandong Meng
Peng Li
Jie Zhou
145
35
0
17 Sep 2021
Deep Visual Navigation under Partial Observability
Bo Ai
Wei Gao
Vinay
David Hsu
87
11
0
16 Sep 2021
PointManifoldCut: Point-wise Augmentation in the Manifold for Point Clouds
Tianfang Zhu
Yue Guan
A. Li
3DPC
84
1
0
15 Sep 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
60
75
0
14 Sep 2021
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection
Steven Lang
Fabrizio G. Ventola
Kristian Kersting
88
15
0
13 Sep 2021
Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data
Yi Yang
Daoye Zhu
Tengteng Qu
Qiangyu Wang
Fuhu Ren
Chengqi Cheng
139
23
0
13 Sep 2021
Learning to Ground Visual Objects for Visual Dialog
Feilong Chen
Xiuyi Chen
Can Xu
Daxin Jiang
OOD
86
18
0
13 Sep 2021
Dynamic Collective Intelligence Learning: Finding Efficient Sparse Model via Refined Gradients for Pruned Weights
Jang-Hyun Kim
Jayeon Yoo
Yeji Song
Kiyoon Yoo
Nojun Kwak
69
6
0
10 Sep 2021