ResearchTrend.AI
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting
Shi-Wee Deng
Yuhang Li
Shanghang Zhang
Shi Gu
244
257
0
24 Feb 2022
Auto-scaling Vision Transformers without Training
Wuyang Chen
Wei-Ping Huang
Xianzhi Du
Xiaodan Song
Zhangyang Wang
Denny Zhou
ViT
73
25
0
24 Feb 2022
Movies2Scenes: Using Movie Metadata to Learn Scene Representation
Shixing Chen
Chundi Liu
Xiang Hao
Xiaohan Nie
Maxim Arap
Raffay Hamid
67
17
0
22 Feb 2022
Trusted AI in Multi-agent Systems: An Overview of Privacy and Security for Distributed Learning
Chuan Ma
Jun Li
Kang Wei
Bo Liu
Ming Ding
Long Yuan
Zhu Han
H. Vincent Poor
106
48
0
18 Feb 2022
General Cyclical Training of Neural Networks
L. Smith
90
6
0
17 Feb 2022
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
Priya Goyal
Quentin Duval
Isaac Seessel
Mathilde Caron
Ishan Misra
Levent Sagun
Armand Joulin
Piotr Bojanowski
VLM
SSL
132
111
0
16 Feb 2022
Cyclical Focal Loss
L. Smith
83
14
0
16 Feb 2022
ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification
Thomas Stegmüller
Behzad Bozorgtabar
A. Spahr
Jean-Philippe Thiran
ViT
MedIm
110
43
0
15 Feb 2022
Balancing Domain Experts for Long-Tailed Camera-Trap Recognition
Byeongjun Park
Jeongsoo Kim
Seungju Cho
Heeseon Kim
Changick Kim
80
2
0
15 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
128
487
0
14 Feb 2022
Towards Disentangling Information Paths with Coded ResNeXt
Apostolos Avranas
Marios Kountouris
FAtt
51
1
0
10 Feb 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
97
50
0
10 Feb 2022
Optimal learning rate schedules in high-dimensional non-convex optimization problems
Stéphane d'Ascoli
Maria Refinetti
Giulio Biroli
55
7
0
09 Feb 2022
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Yang Zhao
Hao Zhang
Xiuyuan Hu
161
122
0
08 Feb 2022
Learning Features with Parameter-Free Layers
Dongyoon Han
Y. Yoo
Beomyoung Kim
Byeongho Heo
73
8
0
06 Feb 2022
Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers
Youjie Li
Amar Phanishayee
D. Murray
Jakub Tarnawski
Nam Sung Kim
50
22
0
02 Feb 2022
TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs
Weiyang Wang
Moein Khazraee
Zhizhen Zhong
M. Ghobadi
Zhihao Jia
Dheevatsa Mudigere
Ying Zhang
A. Kewitsch
131
93
0
01 Feb 2022
Understanding AdamW through Proximal Methods and Scale-Freeness
Zhenxun Zhuang
Mingrui Liu
Ashok Cutkosky
Francesco Orabona
91
72
0
31 Jan 2022
ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise
Minjia Zhang
U. Niranjan
Yuxiong He
61
1
0
29 Jan 2022
Toward Training at ImageNet Scale with Differential Privacy
Alexey Kurakin
Shuang Song
Steve Chien
Roxana Geambasu
Andreas Terzis
Abhradeep Thakurta
110
104
0
28 Jan 2022
Existence and Estimation of Critical Batch Size for Training Generative Adversarial Networks with Two Time-Scale Update Rule
Naoki Sato
Hideaki Iiduka
EGVM
78
4
0
28 Jan 2022
A Systematic Study of Bias Amplification
Melissa Hall
Laurens van der Maaten
Laura Gustafson
Maxwell Jones
Aaron B. Adcock
155
77
0
27 Jan 2022
Revisiting RCAN: Improved Training for Image Super-Resolution
Zudi Lin
Prateek Garg
Atmadeep Banerjee
Salma Abdel Magid
Deqing Sun
Yulun Zhang
Luc Van Gool
D. Wei
Hanspeter Pfister
SupR
100
59
0
27 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
227
383
0
24 Jan 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
Chao-Yuan Wu
Yanghao Li
K. Mangalam
Haoqi Fan
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
129
201
0
20 Jan 2022
Revisiting Weakly Supervised Pre-Training of Visual Perception Models
Mannat Singh
Laura Gustafson
Aaron B. Adcock
Vinicius de Freitas Reis
B. Gedik
Raj Prateek Kosaraju
D. Mahajan
Ross B. Girshick
Piotr Dollár
Laurens van der Maaten
VLM
104
130
0
20 Jan 2022
Near-Optimal Sparse Allreduce for Distributed Deep Learning
Shigang Li
Torsten Hoefler
62
53
0
19 Jan 2022
TriCoLo: Trimodal Contrastive Loss for Text to Shape Retrieval
Yue Ruan
Han-Hung Lee
Yiming Zhang
Ke Zhang
Angel X. Chang
95
22
0
19 Jan 2022
Pruning-aware Sparse Regularization for Network Pruning
Nanfei Jiang
Xu Zhao
Chaoyang Zhao
Yongqi An
Ming Tang
Jinqiao Wang
3DPC
72
13
0
18 Jan 2022
Graph Neural Networks for Cross-Camera Data Association
Elena Luna
Juan C. Sanmiguel
Jose M. Martínez
Pablo Carballeira
60
21
0
17 Jan 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
143
254
0
12 Jan 2022
Gridiron: A Technique for Augmenting Cloud Workloads with Network Bandwidth Requirements
N. Kodirov
Shane Bergsma
Syed M. Iqbal
Alan J. Hu
Ivan Beschastnikh
Margo Seltzer
25
0
0
12 Jan 2022
Partial Model Averaging in Federated Learning: Performance Guarantees and Benefits
Sunwoo Lee
Anit Kumar Sahu
Chaoyang He
Salman Avestimehr
FedML
67
19
0
11 Jan 2022
$m^\ast$ of two-dimensional electron gas: a neural canonical transformation study
H.-j. Xie
Linfeng Zhang
Lei Wang
91
8
0
10 Jan 2022
Relieving Long-tailed Instance Segmentation via Pairwise Class Balance
Yin-Yin He
Peizhen Zhang
Xiu-Shen Wei
Xinming Zhang
Jian Sun
ISeg
120
26
0
08 Jan 2022
Contrastive Neighborhood Alignment
Pengkai Zhu
Zhaowei Cai
Yuanjun Xiong
Zhuowen Tu
Luis Goncalves
Vijay Mahadevan
Stefano Soatto
49
3
0
06 Jan 2022
AutoBalance: Optimized Loss Functions for Imbalanced Data
Mingchen Li
Xuechen Zhang
Christos Thrampoulidis
Jiasi Chen
Samet Oymak
71
68
0
04 Jan 2022
Representation Learning via Consistent Assignment of Views to Clusters
T. Silva
Adín Ramirez Rivera
SSL
52
10
0
31 Dec 2021
Automatic Configuration for Optimal Communication Scheduling in DNN Training
Yiqing Ma
Hao Wang
Yiming Zhang
Kai Chen
35
12
0
27 Dec 2021
DRF Codes: Deep SNR-Robust Feedback Codes
Mahdi Boloursaz Mashhadi
Deniz Gunduz
A. Perotti
B. Popović
54
11
0
22 Dec 2021
Simple and Effective Balance of Contrastive Losses
Arnaud Sors
Rafael Sampaio de Rezende
Sarah Ibrahimi
J. Andreoli
SSL
98
1
0
22 Dec 2021
Learned Queries for Efficient Local Attention
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
143
30
0
21 Dec 2021
Robust and Privacy-Preserving Collaborative Learning: A Comprehensive Survey
Shangwei Guo
Xu Zhang
Feiyu Yang
Tianwei Zhang
Yan Gan
Tao Xiang
Yang Liu
FedML
106
9
0
19 Dec 2021
Efficient Strong Scaling Through Burst Parallel Training
S. Park
Joshua Fried
Sunghyun Kim
Mohammad Alizadeh
Adam Belay
GNN
LRM
64
11
0
19 Dec 2021
Precondition and Effect Reasoning for Action Recognition
Hongsang Yoo
Haopeng Li
Qiuhong Ke
Liangchen Liu
Rui Zhang
CML
92
4
0
19 Dec 2021
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chaoxia Wu
Alan Yuille
Christoph Feichtenhofer
ViT
203
677
0
16 Dec 2021
Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification
Xiaohua Chen
Yucan Zhou
Dayan Wu
Wanqian Zhang
Yu Zhou
Yue Liu
Weiping Wang
66
54
0
15 Dec 2021
Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model
Keqi Deng
Songjun Cao
Yike Zhang
Long Ma
VLM
54
31
0
14 Dec 2021
Server-Side Local Gradient Averaging and Learning Rate Acceleration for Scalable Split Learning
Shraman Pal
Mansi Uniyal
Jihong Park
Praneeth Vepakomma
Ramesh Raskar
M. Bennis
M. Jeon
Jinho Choi
FedML
82
30
0
11 Dec 2021
Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework
Chenxin Tao
Honghui Wang
Xizhou Zhu
Jiahua Dong
S. Song
Gao Huang
Jifeng Dai
SSL
84
61
0
09 Dec 2021