ResearchTrend.AI
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH
arXiv:1706.02677

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Title
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers
Z. Yao
Xiaoxia Wu
Conglong Li
Connor Holmes
Minjia Zhang
Cheng-rong Li
Yuxiong He
87
12
0
17 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
138
60
0
17 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
122
113
0
17 Nov 2022
FedSiam-DA: Dual-aggregated Federated Learning via Siamese Network under Non-IID Data
Ming Yang
Yanhan Wang
Xin Wang
Zhenyong Zhang
Xiaoming Wu
Peng Cheng
FedML
70
1
0
17 Nov 2022
Exploring State Change Capture of Heterogeneous Backbones @ Ego4D Hands and Objects Challenge 2022
Yin-Dong Zheng
Guo Chen
Jiahao Wang
Tong Lu
Liming Wang
80
0
0
16 Nov 2022
Masked Reconstruction Contrastive Learning with Information Bottleneck Principle
Ziwen Liu
Bonan Li
Congying Han
Tiande Guo
Xuecheng Nie
SSL
66
2
0
15 Nov 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
190
546
0
12 Nov 2022
Exploiting the Partly Scratch-off Lottery Ticket for Quantization-Aware Training
Mingliang Xu
Gongrui Nan
Yuxin Zhang
Yong Li
Rongrong Ji
MQ
59
3
0
12 Nov 2022
MSLKANet: A Multi-Scale Large Kernel Attention Network for Scene Text Removal
Guangtao Lyu
65
0
0
12 Nov 2022
Acoustic Pornography Recognition Using Convolutional Neural Networks and Bag of Refinements
Lifeng Zhou
Kaifeng Wei
Yuke Li
Yiya Hao
Weiqiang Yang
Haoqi Zhu
63
1
0
11 Nov 2022
Not Just Plain Text! Fuel Document-Level Relation Extraction with Explicit Syntax Refinement and Subsentence Modeling
Zhichao Duan
Xiuxing Li
Zhenyu Li
Zhuo Wang
Jianyong Wang
65
7
0
10 Nov 2022
Extending Temporal Data Augmentation for Video Action Recognition
Artjoms Gorpincenko
Michal Mackiewicz
ViT
79
4
0
09 Nov 2022
Soft Augmentation for Image Classification
Yang Liu
Shen Yan
Laura Leal-Taixé
James Hays
Deva Ramanan
74
12
0
09 Nov 2022
When & How to Transfer with Transfer Learning
Adrián Tormos
Dario Garcia-Gasulla
Victor Gimenez-Abalos
Sergio Alvarez-Napagao
VLM
67
1
0
08 Nov 2022
On Web-based Visual Corpus Construction for Visual Document Understanding
Donghyun Kim
Teakgyu Hong
Moonbin Yim
Yoonsik Kim
Geewook Kim
95
4
0
07 Nov 2022
Distilling Representations from GAN Generator via Squeeze and Span
Yu Yang
Xiaotian Cheng
Chang-rui Liu
Hakan Bilen
Xiang Ji
GAN
98
0
0
06 Nov 2022
Local Manifold Augmentation for Multiview Semantic Consistency
Yu Yang
Wing Yin Cheung
Chang-rui Liu
Xiang Ji
80
1
0
05 Nov 2022
Rethinking the transfer learning for FCN based polyp segmentation in colonoscopy
Yan-mao Wen
Lei Zhang
Xiangli Meng
Xujiong Ye
55
14
0
04 Nov 2022
Unsupervised Visual Representation Learning via Mutual Information Regularized Assignment
Dong Lee
Sung-Ik Choi
Hyunwoo J. Kim
Sae-Young Chung
SSL
97
7
0
04 Nov 2022
Pixel-Wise Contrastive Distillation
Junqiang Huang
Zichao Guo
133
4
0
01 Nov 2022
Adaptive Compression for Communication-Efficient Distributed Training
Maksim Makarenko
Elnur Gasanov
Rustem Islamov
Abdurakhmon Sadiev
Peter Richtárik
123
16
0
31 Oct 2022
Class Interference of Deep Neural Networks
Dongcui Diao
Hengshuai Yao
Bei Jiang
49
1
0
31 Oct 2022
A simple, efficient and scalable contrastive masked autoencoder for learning visual representations
Shlok Kumar Mishra
Joshua Robinson
Huiwen Chang
David Jacobs
Aaron Sarna
Aaron Maschinot
Dilip Krishnan
DiffM
114
31
0
30 Oct 2022
Parameter-Efficient Tuning Makes a Good Classification Head
Zhuoyi Yang
Ming Ding
Yanhui Guo
Qingsong Lv
Jie Tang
VLM
108
14
0
30 Oct 2022
Auxo: Efficient Federated Learning via Scalable Client Clustering
Jiachen Liu
Fan Lai
Yinwei Dai
Aditya Akella
H. Madhyastha
Mosharaf Chowdhury
120
10
0
29 Oct 2022
DORE: Document Ordered Relation Extraction based on Generative Framework
Qipeng Guo
Yuqing Yang
Hang Yan
Xipeng Qiu
Zheng Zhang
127
7
0
28 Oct 2022
Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation
Bowen Ma
Rudong An
Wei Zhang
Yu-qiong Ding
Zeng Zhao
Rongsheng Zhang
Tangjie Lv
Changjie Fan
Zhipeng Hu
CVBM
103
21
0
28 Oct 2022
1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track
Junfei Xiao
Zhichao Xu
Shiyi Lan
Zhiding Yu
Alan Yuille
Anima Anandkumar
81
5
0
23 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm
ViT
82
62
0
23 Oct 2022
Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation
Atsuyuki Miyai
Qing Yu
Daiki Ikami
Go Irie
Kiyoharu Aizawa
SSL
91
5
0
23 Oct 2022
Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
Zeyun Zhong
David Schneider
Michael Voit
Rainer Stiefelhagen
Jürgen Beyerer
121
47
0
23 Oct 2022
A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes
O. Oyedotun
Konstantinos Papadopoulos
Djamila Aouada
AI4CE
82
12
0
21 Oct 2022
Self-Supervised Learning via Maximum Entropy Coding
Xin Liu
Zhongdao Wang
Yali Li
Shengjin Wang
SSL
134
43
0
20 Oct 2022
A Survey of Computer Vision Technologies In Urban and Controlled-environment Agriculture
Jiayun Luo
Boyang Albert Li
Cyril Leung
152
15
0
20 Oct 2022
Large-batch Optimization for Dense Visual Predictions
Zeyue Xue
Jianming Liang
Guanglu Song
Zhuofan Zong
Liang Chen
Yu Liu
Ping Luo
VLM
96
9
0
20 Oct 2022
SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
Yijin Huang
Junyan Lyu
Pujin Cheng
Roger Tam
Xiaoying Tang
ViT
MedIm
100
20
0
20 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Muralidhar Andoorveedu
Zhanda Zhu
Bojian Zheng
Gennady Pekhimenko
47
7
0
19 Oct 2022
Self-Supervised Learning Through Efference Copies
Franz Scherr
Qinghai Guo
Timoleon Moraitis
76
11
0
17 Oct 2022
A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling
Ye Wang
Xin-Xin Liu
Wen-zhong Hu
Tao Zhang
75
19
0
17 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
Arsany Guirguis
Diana Petrescu
Florin Dinu
D. Quoc
Javier Picorel
R. Guerraoui
73
0
0
16 Oct 2022
Bandwidth-efficient distributed neural network architectures with application to body sensor networks
Thomas Strypsteen
Alexander Bertrand
35
1
0
14 Oct 2022
Vision Transformers provably learn spatial structure
Samy Jelassi
Michael E. Sander
Yuan-Fang Li
ViT
MLT
100
83
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
109
51
0
13 Oct 2022
The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
Zong-xiao Li
Chong You
Srinadh Bhojanapalli
Daliang Li
A. S. Rawat
...
Kenneth Q Ye
Felix Chern
Felix X. Yu
Ruiqi Guo
Surinder Kumar
MoE
102
97
0
12 Oct 2022
Towards Theoretically Inspired Neural Initialization Optimization
Yibo Yang
Hong Wang
Haobo Yuan
Zhouchen Lin
73
11
0
12 Oct 2022
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation
Donghyeon Baek
Youngmin Oh
Sanghoon Lee
Junghyup Lee
Bumsub Ham
CLL
VLM
48
29
0
12 Oct 2022
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement
Erik Wijmans
Irfan Essa
Dhruv Batra
OffRL
115
14
0
11 Oct 2022
Revisiting adapters with adversarial training
Sylvestre-Alvise Rebuffi
Francesco Croce
Sven Gowal
AAML
62
17
0
10 Oct 2022
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis
Yuxin Xiao
Paul Pu Liang
Umang Bhatt
Willie Neiswanger
Ruslan Salakhutdinov
Louis-Philippe Morency
253
98
0
10 Oct 2022
VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Shraman Pramanick
Li Jing
Sayan Nag
Jiachen Zhu
Hardik Shah
Yann LeCun
Ramalingam Chellappa
82
22
0
09 Oct 2022