1706.02677
Cited By
v1
v2 (latest)
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
Papers citing
"Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"
50 / 2,054 papers shown
Title
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers
Z. Yao
Xiaoxia Wu
Conglong Li
Connor Holmes
Minjia Zhang
Cheng-rong Li
Yuxiong He
87
12
0
17 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
138
60
0
17 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
122
113
0
17 Nov 2022
FedSiam-DA: Dual-aggregated Federated Learning via Siamese Network under Non-IID Data
Ming Yang
Yanhan Wang
Xin Wang
Zhenyong Zhang
Xiaoming Wu
Peng Cheng
FedML
70
1
0
17 Nov 2022
Exploring State Change Capture of Heterogeneous Backbones @ Ego4D Hands and Objects Challenge 2022
Yin-Dong Zheng
Guo Chen
Jiahao Wang
Tong Lu
Liming Wang
80
0
0
16 Nov 2022
Masked Reconstruction Contrastive Learning with Information Bottleneck Principle
Ziwen Liu
Bonan Li
Congying Han
Tiande Guo
Xuecheng Nie
SSL
66
2
0
15 Nov 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
190
546
0
12 Nov 2022
Exploiting the Partly Scratch-off Lottery Ticket for Quantization-Aware Training
Mingliang Xu
Gongrui Nan
Yuxin Zhang
Yong Li
Rongrong Ji
MQ
59
3
0
12 Nov 2022
MSLKANet: A Multi-Scale Large Kernel Attention Network for Scene Text Removal
Guangtao Lyu
65
0
0
12 Nov 2022
Acoustic Pornography Recognition Using Convolutional Neural Networks and Bag of Refinements
Lifeng Zhou
Kaifeng Wei
Yuke Li
Yiya Hao
Weiqiang Yang
Haoqi Zhu
63
1
0
11 Nov 2022
Not Just Plain Text! Fuel Document-Level Relation Extraction with Explicit Syntax Refinement and Subsentence Modeling
Zhichao Duan
Xiuxing Li
Zhenyu Li
Zhuo Wang
Jianyong Wang
65
7
0
10 Nov 2022
Extending Temporal Data Augmentation for Video Action Recognition
Artjoms Gorpincenko
Michal Mackiewicz
ViT
79
4
0
09 Nov 2022
Soft Augmentation for Image Classification
Yang Liu
Shen Yan
Laura Leal-Taixé
James Hays
Deva Ramanan
74
12
0
09 Nov 2022
When & How to Transfer with Transfer Learning
Adrián Tormos
Dario Garcia-Gasulla
Victor Gimenez-Abalos
Sergio Alvarez-Napagao
VLM
67
1
0
08 Nov 2022
On Web-based Visual Corpus Construction for Visual Document Understanding
Donghyun Kim
Teakgyu Hong
Moonbin Yim
Yoonsik Kim
Geewook Kim
95
4
0
07 Nov 2022
Distilling Representations from GAN Generator via Squeeze and Span
Yu Yang
Xiaotian Cheng
Chang-rui Liu
Hakan Bilen
Xiang Ji
GAN
98
0
0
06 Nov 2022
Local Manifold Augmentation for Multiview Semantic Consistency
Yu Yang
Wing Yin Cheung
Chang-rui Liu
Xiang Ji
80
1
0
05 Nov 2022
Rethinking the transfer learning for FCN based polyp segmentation in colonoscopy
Yan-mao Wen
Lei Zhang
Xiangli Meng
Xujiong Ye
55
14
0
04 Nov 2022
Unsupervised Visual Representation Learning via Mutual Information Regularized Assignment
Dong Lee
Sung-Ik Choi
Hyunwoo J. Kim
Sae-Young Chung
SSL
97
7
0
04 Nov 2022
Pixel-Wise Contrastive Distillation
Junqiang Huang
Zichao Guo
133
4
0
01 Nov 2022
Adaptive Compression for Communication-Efficient Distributed Training
Maksim Makarenko
Elnur Gasanov
Rustem Islamov
Abdurakhmon Sadiev
Peter Richtárik
123
16
0
31 Oct 2022
Class Interference of Deep Neural Networks
Dongcui Diao
Hengshuai Yao
Bei Jiang
49
1
0
31 Oct 2022
A simple, efficient and scalable contrastive masked autoencoder for learning visual representations
Shlok Kumar Mishra
Joshua Robinson
Huiwen Chang
David Jacobs
Aaron Sarna
Aaron Maschinot
Dilip Krishnan
DiffM
114
31
0
30 Oct 2022
Parameter-Efficient Tuning Makes a Good Classification Head
Zhuoyi Yang
Ming Ding
Yanhui Guo
Qingsong Lv
Jie Tang
VLM
108
14
0
30 Oct 2022
Auxo: Efficient Federated Learning via Scalable Client Clustering
Jiachen Liu
Fan Lai
Yinwei Dai
Aditya Akella
H. Madhyastha
Mosharaf Chowdhury
120
10
0
29 Oct 2022
DORE: Document Ordered Relation Extraction based on Generative Framework
Qipeng Guo
Yuqing Yang
Hang Yan
Xipeng Qiu
Zheng Zhang
127
7
0
28 Oct 2022
Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation
Bowen Ma
Rudong An
Wei Zhang
Yu-qiong Ding
Zeng Zhao
Rongsheng Zhang
Tangjie Lv
Changjie Fan
Zhipeng Hu
CVBM
103
21
0
28 Oct 2022
1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track
Junfei Xiao
Zhichao Xu
Shiyi Lan
Zhiding Yu
Alan Yuille
Anima Anandkumar
81
5
0
23 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm
ViT
82
62
0
23 Oct 2022
Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation
Atsuyuki Miyai
Qing Yu
Daiki Ikami
Go Irie
Kiyoharu Aizawa
SSL
91
5
0
23 Oct 2022
Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
Zeyun Zhong
David Schneider
Michael Voit
Rainer Stiefelhagen
Jürgen Beyerer
121
47
0
23 Oct 2022
A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes
O. Oyedotun
Konstantinos Papadopoulos
Djamila Aouada
AI4CE
82
12
0
21 Oct 2022
Self-Supervised Learning via Maximum Entropy Coding
Xin Liu
Zhongdao Wang
Yali Li
Shengjin Wang
SSL
134
43
0
20 Oct 2022
A Survey of Computer Vision Technologies In Urban and Controlled-environment Agriculture
Jiayun Luo
Boyang Albert Li
Cyril Leung
152
15
0
20 Oct 2022
Large-batch Optimization for Dense Visual Predictions
Zeyue Xue
Jianming Liang
Guanglu Song
Zhuofan Zong
Liang Chen
Yu Liu
Ping Luo
VLM
96
9
0
20 Oct 2022
SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
Yijin Huang
Junyan Lyu
Pujin Cheng
Roger Tam
Xiaoying Tang
ViT
MedIm
100
20
0
20 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Muralidhar Andoorveedu
Zhanda Zhu
Bojian Zheng
Gennady Pekhimenko
47
7
0
19 Oct 2022
Self-Supervised Learning Through Efference Copies
Franz Scherr
Qinghai Guo
Timoleon Moraitis
76
11
0
17 Oct 2022
A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling
Ye Wang
Xin-Xin Liu
Wen-zhong Hu
Tao Zhang
75
19
0
17 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
Arsany Guirguis
Diana Petrescu
Florin Dinu
D. Quoc
Javier Picorel
R. Guerraoui
73
0
0
16 Oct 2022
Bandwidth-efficient distributed neural network architectures with application to body sensor networks
Thomas Strypsteen
Alexander Bertrand
35
1
0
14 Oct 2022
Vision Transformers provably learn spatial structure
Samy Jelassi
Michael E. Sander
Yuan-Fang Li
ViT
MLT
100
83
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
109
51
0
13 Oct 2022
The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
Zong-xiao Li
Chong You
Srinadh Bhojanapalli
Daliang Li
A. S. Rawat
...
Kenneth Q Ye
Felix Chern
Felix X. Yu
Ruiqi Guo
Surinder Kumar
MoE
102
97
0
12 Oct 2022
Towards Theoretically Inspired Neural Initialization Optimization
Yibo Yang
Hong Wang
Haobo Yuan
Zhouchen Lin
73
11
0
12 Oct 2022
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation
Donghyeon Baek
Youngmin Oh
Sanghoon Lee
Junghyup Lee
Bumsub Ham
CLL
VLM
48
29
0
12 Oct 2022
VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement
Erik Wijmans
Irfan Essa
Dhruv Batra
OffRL
115
14
0
11 Oct 2022
Revisiting adapters with adversarial training
Sylvestre-Alvise Rebuffi
Francesco Croce
Sven Gowal
AAML
62
17
0
10 Oct 2022
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis
Yuxin Xiao
Paul Pu Liang
Umang Bhatt
Willie Neiswanger
Ruslan Salakhutdinov
Louis-Philippe Morency
253
98
0
10 Oct 2022
VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Shraman Pramanick
Li Jing
Sayan Nag
Jiachen Zhu
Hardik Shah
Yann LeCun
Ramalingam Chellappa
82
22
0
09 Oct 2022