1706.02677
Cited By
v1
v2 (latest)
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
Papers citing
"Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"
50 / 2,054 papers shown
Title
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Yi Ding
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
88
1
0
09 Dec 2021
Exploring Temporal Granularity in Self-Supervised Video Representation Learning
Rui Qian
Yeqing Li
Liangzhe Yuan
Boqing Gong
Ting Liu
Matthew A. Brown
Serge Belongie
Ming-Hsuan Yang
Hartwig Adam
Huayu Chen
AI4TS
94
6
0
08 Dec 2021
DiPS: Differentiable Policy for Sketching in Recommender Systems
Aritra Ghosh
Saayan Mitra
Andrew Lan
BDL
OffRL
57
2
0
08 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
163
699
0
02 Dec 2021
Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent
Wei Zhang
Mingrui Liu
Yu Feng
Xiaodong Cui
Brian Kingsbury
Yuhai Tu
53
3
0
02 Dec 2021
On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective
Xiaowu Dai
Yuhua Zhu
49
4
0
02 Dec 2021
The Majority Can Help The Minority: Context-rich Minority Oversampling for Long-tailed Classification
Seulki Park
Youngkyu Hong
Byeongho Heo
Sangdoo Yun
J. Choi
131
157
0
01 Dec 2021
DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
Lukas Hoyer
Dengxin Dai
Luc Van Gool
AI4CE
107
462
0
29 Nov 2021
Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping
Xuran Meng
Jianfeng Yao
98
7
0
26 Nov 2021
Learning from Temporal Gradient for Semi-supervised Action Recognition
Junfei Xiao
Longlong Jing
Lin Zhang
Ju He
Qi She
Zongwei Zhou
Alan Yuille
Yingwei Li
89
53
0
25 Nov 2021
Self-Distilled Self-Supervised Representation Learning
Jiho Jang
Seonhoon Kim
Kiyoon Yoo
Chaerin Kong
Jang-Hyun Kim
Nojun Kwak
SSL
98
15
0
25 Nov 2021
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang
Kunchang Li
Yali Wang
Yuxiang Chen
Shashwat Chandra
Yu Qiao
Luoqi Liu
Mike Zheng Shou
AI4TS
100
30
0
24 Nov 2021
ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment
Robin Karlsson
Tomoki Hayashi
Keisuke Fujii
Alexander Carballo
Kento Ohtani
K. Takeda
SSL
71
4
0
24 Nov 2021
Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang
Xitong Yang
Hengduo Li
Li Liu
Zuxuan Wu
Yu-Gang Jiang
ViT
68
67
0
23 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollár
Kaiming He
Ross B. Girshick
118
170
0
22 Nov 2021
Combined Scaling for Zero-shot Transfer Learning
Hieu H. Pham
Zihang Dai
Golnaz Ghiasi
Kenji Kawaguchi
Hanxiao Liu
...
Yi-Ting Chen
Minh-Thang Luong
Yonghui Wu
Mingxing Tan
Quoc V. Le
VLM
120
202
0
19 Nov 2021
Rethinking Dilated Convolution for Real-time Semantic Segmentation
Roland Gao
SSeg
78
45
0
18 Nov 2021
Evaluating Transformers for Lightweight Action Recognition
Raivo Koot
Markus Hennerbichler
Haiping Lu
ViT
82
8
0
18 Nov 2021
Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction
George Yiasemis
Jan-Jakob Sonke
C. Sánchez
Jonas Teuwen
148
61
0
18 Nov 2021
COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression
Sian Jin
Chengming Zhang
Xintong Jiang
Yunhe Feng
Hui Guan
Guanpeng Li
Shuaiwen Leon Song
Dingwen Tao
46
25
0
18 Nov 2021
Deep neural networks-based denoising models for CT imaging and their efficacy
Prabhat Kc
R. Zeng
M. M. Farhangi
Kyle J. Myers
29
20
0
18 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
110
34
0
16 Nov 2021
CGX: Adaptive System Support for Communication-Efficient Deep Learning
I. Markov
Hamidreza Ramezanikebrya
Dan Alistarh
GNN
82
5
0
16 Nov 2021
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation
William J. McNally
Kanav Vats
Alexander Wong
J. McPhee
103
68
0
16 Nov 2021
Task allocation for decentralized training in heterogeneous environment
Yongyue Chao
Ming-Ray Liao
Jiaxin Gao
31
0
0
16 Nov 2021
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
81
3
0
15 Nov 2021
Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization
Byeonggeun Kim
Seunghan Yang
Jang-Hyun Kim
Simyung Chang
48
15
0
12 Nov 2021
Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication
Łukasz Kuciński
Tomasz Korbak
P. Kołodziej
Piotr Miłoś
111
20
0
11 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
747
7,885
0
11 Nov 2021
Scaling ASR Improves Zero and Few Shot Learning
Alex Xiao
Weiyi Zheng
Gil Keren
Duc Le
Frank Zhang
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
Abdel-rahman Mohamed
80
23
0
10 Nov 2021
OSSEM: one-shot speaker adaptive speech enhancement using meta learning
Cheng Yu
Szu-Wei Fu
Tsun-An Hsieh
Yu Tsao
Mirco Ravanelli
VLM
84
4
0
10 Nov 2021
Are Transformers More Robust Than CNNs?
Yutong Bai
Jieru Mei
Alan Yuille
Cihang Xie
ViT
AAML
262
270
0
10 Nov 2021
Data Augmentation Can Improve Robustness
Sylvestre-Alvise Rebuffi
Sven Gowal
D. A. Calian
Florian Stimberg
Olivia Wiles
Timothy A. Mann
AAML
65
293
0
09 Nov 2021
A Survey and Empirical Evaluation of Parallel Deep Learning Frameworks
Daniel Nichols
Siddharth Singh
Shuqing Lin
A. Bhatele
OOD
57
9
0
09 Nov 2021
BlueFog: Make Decentralized Algorithms Practical for Optimization and Deep Learning
Bicheng Ying
Kun Yuan
Hanbin Hu
Yiming Chen
W. Yin
FedML
83
28
0
08 Nov 2021
Finite-Time Consensus Learning for Decentralized Optimization with Nonlinear Gossiping
Junya Chen
Sijia Wang
Lawrence Carin
Chenyang Tao
39
3
0
04 Nov 2021
MixSiam: A Mixture-based Approach to Self-supervised Representation Learning
Xiaoyang Guo
Tianhao Zhao
Yutian Lin
Bo Du
SSL
62
6
0
04 Nov 2021
PatchGame: Learning to Signal Mid-level Patches in Referential Games
Kamal Gupta
Gowthami Somepalli
Anubhav Gupta
Vinoj Jayasundara
Matthias Zwicker
Abhinav Shrivastava
79
4
0
02 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
90
15
0
01 Nov 2021
To Talk or to Work: Delay Efficient Federated Learning over Mobile Edge Devices
Pavana Prakash
Jiahao Ding
Maoqiang Wu
Minglei Shu
Rong Yu
Miao Pan
FedML
66
3
0
01 Nov 2021
Learning Debiased and Disentangled Representations for Semantic Segmentation
Sanghyeok Chu
Dongwan Kim
Bohyung Han
70
22
0
31 Oct 2021
Sustainable AI: Environmental Implications, Challenges and Opportunities
Carole-Jean Wu
Ramya Raghavendra
Udit Gupta
Bilge Acun
Newsha Ardalani
...
Maximilian Balandat
Joe Spisak
R. Jain
Michael G. Rabbat
K. Hazelwood
159
418
0
30 Oct 2021
Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition
Dinghao Fan
Hengjie Lu
Shugong Xu
Shan Cao
67
16
0
29 Oct 2021
OneFlow: Redesign the Distributed Deep Learning Framework from Scratch
Jinhui Yuan
Xinqi Li
Cheng Cheng
Juncheng Liu
Ran Guo
...
Fei Yang
Xiaodong Yi
Chuan Wu
Haoran Zhang
Jie Zhao
62
41
0
28 Oct 2021
GenURL: A General Framework for Unsupervised Representation Learning
Siyuan Li
Zicheng Liu
Z. Zang
Di Wu
Zhiyuan Chen
Stan Z. Li
OOD
3DGS
OffRL
136
9
0
27 Oct 2021
Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums
Boyao Wang
Haishan Ye
Tong Zhang
116
15
0
27 Oct 2021
Exponential Graph is Provably Efficient for Decentralized Deep Training
Bicheng Ying
Kun Yuan
Yiming Chen
Hanbin Hu
Pan Pan
W. Yin
FedML
115
89
0
26 Oct 2021
Parameter Prediction for Unseen Deep Architectures
Boris Knyazev
M. Drozdzal
Graham W. Taylor
Adriana Romero Soriano
OOD
119
83
0
25 Oct 2021
Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups
David M. Knigge
David W. Romero
Erik J. Bekkers
96
30
0
25 Oct 2021
ZerO Initialization: Initializing Neural Networks with only Zeros and Ones
Jiawei Zhao
Florian Schäfer
Anima Anandkumar
105
26
0
25 Oct 2021