Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Training with Mixed-Precision Floating-Point Assignments
Wonyeol Lee
Rahul Sharma
A. Aiken
MQ
43
3
0
31 Jan 2023
Emergence of Maps in the Memories of Blind Navigation Agents
Erik Wijmans
Manolis Savva
Irfan Essa
Stefan Lee
Ari S. Morcos
Dhruv Batra
74
33
0
30 Jan 2023
ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning
Junguang Jiang
Baixu Chen
Junwei Pan
Ximei Wang
Liu Dapeng
Jie Jiang
Mingsheng Long
MoMe
93
23
0
30 Jan 2023
Unlocking Deterministic Robustness Certification on ImageNet
Kaiqin Hu
Andy Zou
Zifan Wang
Klas Leino
Matt Fredrikson
OOD
135
14
0
29 Jan 2023
Pipe-BD: Pipelined Parallel Blockwise Distillation
Hongsun Jang
Jaewon Jung
Jaeyong Song
Joonsang Yu
Youngsok Kim
Jinho Lee
MoE, AI4CE
82
2
0
29 Jan 2023
Practical Differentially Private Hyperparameter Tuning with Subsampling
A. Koskela
Tejas D. Kulkarni
114
17
0
27 Jan 2023
Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song
Jinkyu Yim
Jaewon Jung
Hongsun Jang
H. Kim
Youngsok Kim
Jinho Lee
GNN
74
28
0
24 Jan 2023
ScaDLES: Scalable Deep Learning over Streaming data at the Edge
S. Tyagi
Martin Swany
52
6
0
21 Jan 2023
ABS: Adaptive Bounded Staleness Converges Faster and Communicates Less
Qiao Tan
Feng Zhu
Jingjing Zhang
84
0
0
21 Jan 2023
Masked Autoencoding Does Not Help Natural Language Supervision at Scale
Floris Weers
Vaishaal Shankar
Angelos Katharopoulos
Yinfei Yang
Tom Gunter
CLIP
54
5
0
19 Jan 2023
Active learning for medical image segmentation with stochastic batches
Mélanie Gaillochet
Christian Desrosiers
H. Lombaert
UQCV
89
23
0
18 Jan 2023
ViT-AE++: Improving Vision Transformer Autoencoder for Self-supervised Medical Image Representations
Chinmay Prabhakar
Hongwei Bran Li
Jiancheng Yang
Suprosanna Shit
Benedikt Wiestler
Bjoern Menze
ViT, MedIm
75
11
0
18 Jan 2023
TAAL: Test-time Augmentation for Active Learning in Medical Image Segmentation
Mélanie Gaillochet
Christian Desrosiers
H. Lombaert
62
12
0
16 Jan 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval
Bo Fang
Wenhao Wu
Chang-rui Liu
Yu Zhou
Yuxin Song
Weiping Wang
Min Yang
Xiang Ji
Jingdong Wang
107
57
0
16 Jan 2023
CMAE-V: Contrastive Masked Autoencoders for Video Action Recognition
Cheng Lu
Xiaojie Jin
Zhicheng Huang
Qibin Hou
Ming-Ming Cheng
Jiashi Feng
61
9
0
15 Jan 2023
Towards Spatial Equilibrium Object Detection
Zhaohui Zheng
Yuming Chen
Qibin Hou
Xiang Li
Ming-Ming Cheng
ObjD
56
0
0
14 Jan 2023
SemPPL: Predicting pseudo-labels for better contrastive representations
Matko Bošnjak
Pierre Harvey Richemond
Nenad Tomašev
Florian Strub
Jacob Walker
Felix Hill
Lars Buesing
Razvan Pascanu
Charles Blundell
Jovana Mitrović
SSL, VLM
101
9
0
12 Jan 2023
Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning
Jidong Ge
YuXiang Liu
Jie Gui
Lanting Fang
Ming Lin
James T. Kwok
LiGuo Huang
B. Luo
SSL
86
5
0
08 Jan 2023
Infomaxformer: Maximum Entropy Transformer for Long Time-Series Forecasting Problem
Peiwang Tang
Xianchao Zhang
AI4TS
116
6
0
04 Jan 2023
Decentralized Gradient Tracking with Local Steps
Yue Liu
Tao R. Lin
Anastasia Koloskova
Sebastian U. Stich
105
41
0
03 Jan 2023
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
SyDa
163
822
0
02 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
278
560
0
02 Jan 2023
Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling
Xin Ma
Chang-Shu Liu
Chunyu Xie
Long Ye
Yafeng Deng
Xiang Ji
137
10
0
31 Dec 2022
Deep set conditioned latent representations for action recognition
Akash Singh
Tom De Schepper
Kevin Mets
P. Hellinckx
José Oramas
Steven Latré
BDL
69
2
0
21 Dec 2022
Input Normalized Stochastic Gradient Descent Training of Deep Neural Networks
S. Atici
Hongyi Pan
Ahmet Enis Cetin
ODL
23
0
0
20 Dec 2022
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
163
2,440
0
19 Dec 2022
Learning useful representations for shifting tasks and distributions
Jianyu Zhang
Léon Bottou
OOD
76
14
0
14 Dec 2022
Maximal Initial Learning Rates in Deep ReLU Networks
Gaurav M. Iyer
Boris Hanin
David Rolnick
81
10
0
14 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Maja Pantic
SSL
92
49
0
12 Dec 2022
Accelerating Self-Supervised Learning via Efficient Training Strategies
Mustafa Taha Koçyiğit
Timothy M. Hospedales
Hakan Bilen
SSL
66
8
0
11 Dec 2022
VindLU: A Recipe for Effective Video-and-Language Pretraining
Feng Cheng
Xizi Wang
Jie Lei
David J. Crandall
Joey Tianyi Zhou
Gedas Bertasius
VLM
125
81
0
09 Dec 2022
Audiovisual Masked Autoencoders
Mariana-Iuliana Georgescu
Eduardo Fonseca
Radu Tudor Ionescu
Mario Lucic
Cordelia Schmid
Anurag Arnab
SSL
118
45
0
09 Dec 2022
Benchmarking Self-Supervised Learning on Diverse Pathology Datasets
Mingu Kang
Heon Song
Seonwook Park
Donggeun Yoo
Sérgio Pereira
63
139
0
09 Dec 2022
Training Data Influence Analysis and Estimation: A Survey
Zayd Hammoudeh
Daniel Lowd
TDI
117
101
0
09 Dec 2022
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Ashwinee Panda
Xinyu Tang
Saeed Mahloujifar
Vikash Sehwag
Prateek Mittal
126
12
0
08 Dec 2022
An Empirical Study on Multi-Domain Robust Semantic Segmentation
Yajie Liu
Pu Ge
Qingjie Liu
Shichao Fan
Yunhong Wang
47
2
0
08 Dec 2022
Exploring Stochastic Autoregressive Image Modeling for Visual Representation
Yu-Hang Qi
Fan Yang
Yousong Zhu
Yufei Liu
Liwei Wu
Rui Zhao
Wei Li
DiffM
57
13
0
03 Dec 2022
Scaling Language-Image Pre-training via Masking
Yanghao Li
Haoqi Fan
Ronghang Hu
Christoph Feichtenhofer
Kaiming He
CLIP, VLM
111
330
0
01 Dec 2022
Hyperbolic Contrastive Learning for Visual Representations beyond Objects
Songwei Ge
Shlok Kumar Mishra
Simon Kornblith
Chun-Liang Li
David Jacobs
OCL, SSL
129
57
0
01 Dec 2022
Disentangling the Mechanisms Behind Implicit Regularization in SGD
Cheng-i Wang
Simran Kaur
Tanya Marwah
Saurabh Garg
Zachary Chase Lipton
FedML
100
2
0
29 Nov 2022
Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking
Elena Luna
Juan Carlos San Miguel
J. Sanchez
Marcos Escudero-Viñolo
77
5
0
28 Nov 2022
Towards Better Document-level Relation Extraction via Iterative Inference
Li Zhang
Jinsong Su
Yidong Chen
Zhongjian Miao
Zijun Min
Qingguo Hu
X. Shi
71
11
0
26 Nov 2022
Deep Learning Training Procedure Augmentations
Cristian Simionescu
104
1
0
25 Nov 2022
Far3Det: Towards Far-Field 3D Detection
Shubham Gupta
Jeet Kanjani
Mengtian Li
Francesco Ferroni
James Hays
Deva Ramanan
Shu Kong
3DPC
82
10
0
25 Nov 2022
Differentially Private Image Classification from Features
Harsh Mehta
Walid Krichene
Abhradeep Thakurta
Alexey Kurakin
Ashok Cutkosky
113
8
0
24 Nov 2022
Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground
Haoxin Li
Yuan Liu
Hanwang Zhang
Boyang Li
84
16
0
23 Nov 2022
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
M. Jehanzeb Mirza
Pol Jané Soneira
W. Lin
Mateusz Koziński
Horst Possegger
Horst Bischof
VLM, TTA
103
29
0
23 Nov 2022
Reason from Context with Self-supervised Learning
Xinyu Liu
Ankur Sikarwar
Gabriel Kreiman
Zenglin Shi
Mengmi Zhang
ReLM, LRM
94
1
0
23 Nov 2022
N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
Haram Choi
Jeong-Sik Lee
Jihoon Yang
ViT
77
83
0
21 Nov 2022
Unifying Vision-Language Representation Space with Single-tower Transformer
Jiho Jang
Chaerin Kong
D. Jeon
Seonhoon Kim
Nojun Kwak
113
21
0
21 Nov 2022