Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Towards Understanding the Effect of Pretraining Label Granularity
Guanzhe Hong
Huayu Chen
Ariel Fuxman
Stanley H. Chan
Enming Luo
58
2
0
29 Mar 2023
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
55
3
0
28 Mar 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li
Yali Wang
Yizhuo Li
Yi Wang
Yinan He
Limin Wang
Yu Qiao
VGen
136
169
0
28 Mar 2023
TabRet: Pre-training Transformer-based Tabular Models for Unseen Columns
Soma Onishi
Kenta Oono
Kohei Hayashi
LMTD
70
16
0
28 Mar 2023
Solving Regularized Exp, Cosh and Sinh Regression Problems
Zhihang Li
Zhao Song
Dinesh Manocha
97
39
0
28 Mar 2023
Large-scale pretraining on pathological images for fine-tuning of small pathological benchmarks
Masataka Kawai
Noriaki Ota
Shinsuke Yamaoka
58
6
0
28 Mar 2023
Selective Structured State-Spaces for Long-Form Video Understanding
Jue Wang
Wenjie Zhu
Pichao Wang
Xiang Yu
Linda Liu
Mohamed Omar
Raffay Hamid
92
100
0
25 Mar 2023
Mathematical Challenges in Deep Learning
V. Nia
Guojun Zhang
I. Kobyzev
Michael R. Metel
Xinlin Li
...
S. Hemati
M. Asgharian
Linglong Kong
Wulong Liu
Boxing Chen
AI4CE, VLM
72
1
0
24 Mar 2023
Fairness Improves Learning from Noisily Labeled Long-Tailed Data
Jiaheng Wei
Zhaowei Zhu
Gang Niu
Tongliang Liu
Sijia Liu
Masashi Sugiyama
Yang Liu
68
7
0
22 Mar 2023
Pre-NeRF 360: Enriching Unbounded Appearances for Neural Radiance Fields
Ahmad AlMughrabi
Umair Haroon
Ricardo Marques
Petia Radeva
66
6
0
21 Mar 2023
ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders
J. Hernandez
Ruben Villegas
Vicente Ordonez
SSL
75
4
0
21 Mar 2023
Texture Learning Domain Randomization for Domain Generalized Segmentation
Sunghwan Kim
Dae-Hwan Kim
Hoseong Kim
93
18
0
21 Mar 2023
More From Less: Self-Supervised Knowledge Distillation for Routine Histopathology Data
Lucas Farndale
R. Insall
Ke Yuan
64
3
0
19 Mar 2023
Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval
Yi Xie
Huaidong Zhang
Xuemiao Xu
Jianqing Zhu
Shengfeng He
VLM
58
14
0
16 Mar 2023
BiFormer: Vision Transformer with Bi-Level Routing Attention
Lei Zhu
Xinjiang Wang
Zhanghan Ke
Wayne Zhang
Rynson W. H. Lau
192
539
0
15 Mar 2023
DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception
Jiayu Zou
Zheng Hua Zhu
Yun Ye
Xingang Wang
DiffM
64
23
0
15 Mar 2023
Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models
Guoxuan Xia
C. Bouganis
UQCV
113
12
0
14 Mar 2023
Challenges and Practices of Deep Learning Model Reengineering: A Case Study on Computer Vision
Wenxin Jiang
Vishnu Banna
Naveen Vivek
Abhinav Goel
Nicholas Synovic
George K. Thiruvathukal
James C. Davis
VLM
81
23
0
13 Mar 2023
An Improved Baseline Framework for Pose Estimation Challenge at ECCV 2022 Visual Perception for Navigation in Human Environments Workshop
Jiajun Fu
Yonghao Dang
Ruoqi Yin
Shaojie Zhang
F. Zhou
Wending Zhao
Jianqin Yin
16
1
0
13 Mar 2023
DPPMask: Masked Image Modeling with Determinantal Point Processes
Junde Xu
Zikai Lin
Donghao Zhou
Yao-Cheng Yang
Xiangyun Liao
Bian Wu
Guangyong Chen
Pheng-Ann Heng
76
1
0
13 Mar 2023
Masked Image Modeling with Local Multi-Scale Reconstruction
Haoqing Wang
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhiwei Deng
Kai Han
90
52
0
09 Mar 2023
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Ziheng Qin
Kaidi Wang
Zangwei Zheng
Jianyang Gu
Xiang Peng
...
Daquan Zhou
Lei Shang
Baigui Sun
Xuansong Xie
Yang You
187
53
0
08 Mar 2023
PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling
Yuan Liu
Songyang Zhang
Jiacheng Chen
Kai-xiang Chen
Dahua Lin
119
30
0
04 Mar 2023
What Is Missing in IRM Training and Evaluation? Challenges and Solutions
Yihua Zhang
Pranay Sharma
Parikshit Ram
Min-Fong Hong
Kush R. Varshney
Sijia Liu
84
13
0
04 Mar 2023
EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
Jun S. Song
Jungsoo Lee
In So Kweon
Sungha Choi
TTA
95
94
0
03 Mar 2023
Dropout Reduces Underfitting
Zhuang Liu
Zhi-Qin John Xu
Joseph Jin
Zhiqiang Shen
Trevor Darrell
160
42
0
02 Mar 2023
Efficient Masked Autoencoders with Self-Consistency
Zhaowen Li
Yousong Zhu
Zhiyang Chen
Wei Li
Chaoyang Zhao
Rui Zhao
Ming Tang
Jinqiao Wang
136
2
0
28 Feb 2023
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations
Ziyu Jiang
Yinpeng Chen
Mengchen Liu
Dongdong Chen
Xiyang Dai
Lu Yuan
Zicheng Liu
Zhangyang Wang
SSL, VLM, CLIP
102
18
0
27 Feb 2023
Open Set Action Recognition via Multi-Label Evidential Learning
Chen Zhao
Dawei Du
A. Hoogs
Christopher Funk
EDL
68
26
0
27 Feb 2023
Hulk: Graph Neural Networks for Optimizing Regionally Distributed Computing Systems
Zheng Yuan
HU Xue
Chaoyun Zhang
Yongming Liu
GNN, AI4CE
39
1
0
27 Feb 2023
DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining
Lin Zhang
Shaoshuai Shi
Xiaowen Chu
Wei Wang
Yue Liu
Chengjian Liu
77
11
0
24 Feb 2023
Phase diagram of early training dynamics in deep neural networks: effect of the learning rate, depth, and width
Dayal Singh Kalra
M. Barkeshli
133
9
0
23 Feb 2023
KHAN: Knowledge-Aware Hierarchical Attention Networks for Accurate Political Stance Prediction
Yunyong Ko
Seongeun Ryu
Soeun Han
Youngseung Jeon
Jaehoon Kim
Sohyun Park
Kyungsik Han
Hanghang Tong
Sang-Wook Kim
115
15
0
23 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm, ViT, 3DV
130
21
0
21 Feb 2023
Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts
Francesco Croce
Sylvestre-Alvise Rebuffi
Evan Shelhamer
Sven Gowal
AAML
79
18
0
20 Feb 2023
Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning
Xinyue Hu
Lin Gu
Kazuma Kobayashi
Qi A. An
Qingyu Chen
Zhiyong Lu
Chang Su
Tatsuya Harada
Yingying Zhu
GNN
71
10
0
19 Feb 2023
Improving Training Stability for Multitask Ranking Models in Recommender Systems
Jiaxi Tang
Yoel Drori
Daryl Chang
M. Sathiamoorthy
Justin Gilmer
Li Wei
Xinyang Yi
Lichan Hong
Ed H. Chi
100
10
0
17 Feb 2023
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability
Mathieu Even
Scott Pesme
Suriya Gunasekar
Nicolas Flammarion
83
18
0
17 Feb 2023
À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting
Benjamin Bowman
Alessandro Achille
Luca Zancato
Matthew Trager
Pramuditha Perera
Giovanni Paolini
Stefano Soatto
VP, VLM
91
19
0
15 Feb 2023
EPISODE: Episodic Gradient Clipping with Periodic Resampled Corrections for Federated Learning with Heterogeneous Data
M. Crawshaw
Yajie Bao
Mingrui Liu
FedML
87
8
0
14 Feb 2023
SWIFT: Expedited Failure Recovery for Large-scale DNN Training
Keon Jang
Hassan M. G. Wassel
Behnam Montazeri
Michael Ryan
David Wetherall
56
8
0
13 Feb 2023
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
Jiang Yang
Sheng Guo
Gangshan Wu
Limin Wang
VLM
58
7
0
13 Feb 2023
LiT Tuned Models for Efficient Species Detection
Andre Nakkab
Ben Feuer
Chinmay Hegde
VLM
33
1
0
12 Feb 2023
TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation
Hyesu Lim
Byeonggeun Kim
Jaegul Choo
Sungha Choi
TTA
83
92
0
10 Feb 2023
Better Diffusion Models Further Improve Adversarial Training
Zekai Wang
Tianyu Pang
Chao Du
Min Lin
Weiwei Liu
Shuicheng Yan
DiffM
106
228
0
09 Feb 2023
Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
Ashok Cutkosky
Harsh Mehta
Francesco Orabona
105
34
0
07 Feb 2023
Cluster-Level Contrastive Learning for Emotion Recognition in Conversations
Kailai Yang
Tianlin Zhang
Hassan Alhuzali
Sophia Ananiadou
87
44
0
07 Feb 2023
Topology-aware Federated Learning in Edge Computing: A Comprehensive Survey
Jiajun Wu
Steve Drew
Fan Dong
Zhuangdi Zhu
Jiayu Zhou
FedML
122
53
0
06 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
130
49
0
02 Feb 2023
NDJIR: Neural Direct and Joint Inverse Rendering for Geometry, Lights, and Materials of Real Object
K. Yoshiyama
T. Narihira
3DV
56
1
0
02 Feb 2023