Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Sample-Driven Federated Learning for Energy-Efficient and Real-Time IoT Sensing
Minh Ngoc Luu
Minh-Duong Nguyen
E. Bedeer
Van Duc Nguyen
D. Hoang
Diep N. Nguyen
Quoc-Viet Pham
62
3
0
11 Oct 2023
Excision And Recovery: Visual Defect Obfuscation Based Self-Supervised Anomaly Detection Strategy
Yeonghyeon Park
Sungho Kang
Myung Jin Kim
Yeonho Lee
Hyeong Seok Kim
Juneho Yi
AAML
80
2
0
06 Oct 2023
DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training
Aochuan Chen
Yimeng Zhang
Jinghan Jia
James Diffenderfer
Jiancheng Liu
Konstantinos Parasyris
Yihua Zhang
Zheng Zhang
B. Kailkhura
Sijia Liu
144
48
0
03 Oct 2023
Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection
Yiming Xie
Huaizu Jiang
Georgia Gkioxari
Julian Straub
3DPC
62
9
0
02 Oct 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
53
6
0
28 Sep 2023
Channel Vision Transformers: An Image Is Worth 1 x 16 x 16 Words
Yu Bao
Srinivasan Sivanandan
Theofanis Karaletsos
ViT
88
27
0
28 Sep 2023
Distortion Resilience for Goal-Oriented Semantic Communication
Minh-Duong Nguyen
Quang-Vinh Do
Zhaohui Yang
Quoc-Viet Pham
Won Joo Hwang
46
1
0
26 Sep 2023
Revisiting LARS for Large Batch Training Generalization of Neural Networks
K. Do
Duong Nguyen
Hoa Nguyen
Long Tran-Thanh
Nguyen-Hoang Tran
Quoc-Viet Pham
AI4CE
ODL
69
1
0
25 Sep 2023
Accelerating Large Batch Training via Gradient Signal to Noise Ratio (GSNR)
Guo-qing Jiang
Jinlong Liu
Zixiang Ding
Lin Guo
W. Lin
AI4CE
56
2
0
24 Sep 2023
It's Simplex! Disaggregating Measures to Improve Certified Robustness
Andrew C. Cullen
Paul Montague
Shijie Liu
S. Erfani
Benjamin I. P. Rubinstein
78
3
0
20 Sep 2023
On the different regimes of Stochastic Gradient Descent
Antonio Sclocchi
Matthieu Wyart
71
20
0
19 Sep 2023
Zero- and Few-shot Sound Event Localization and Detection
Kazuki Shimada
Kengo Uchida
Yuichiro Koyama
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
Tatsuya Kawahara
79
4
0
17 Sep 2023
Rethinking Learning Rate Tuning in the Era of Large Language Models
Hongpeng Jin
Wenqi Wei
Xuyu Wang
Wenbin Zhang
Yanzhao Wu
71
11
0
16 Sep 2023
TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification
Yiqian Cai
Peihong Zhang
Shengchen Li
84
7
0
15 Sep 2023
Fast FixMatch: Faster Semi-Supervised Learning with Curriculum Batch Size
John Chen
Chen Dun
Anastasios Kyrillidis
48
3
0
07 Sep 2023
BeeTLe: A Framework for Linear B-Cell Epitope Prediction and Classification
Xiao Yuan
57
3
0
05 Sep 2023
TSTTC: A Large-Scale Dataset for Time-to-Contact Estimation in Driving Scenarios
Yuheng Shi
Zehao Huang
Yan Yan
Naiyan Wang
Xiaojie Guo
51
1
0
04 Sep 2023
Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning
Minghao Zhu
Xiao Lin
Ronghao Dang
Chengju Liu
Qi Chen
VGen
83
9
0
01 Sep 2023
From SMOTE to Mixup for Deep Imbalanced Classification
Wei Cheng
Tan-Ha Mai
Hsuan-Tien Lin
51
3
0
29 Aug 2023
ABS-SGD: A Delayed Synchronous Stochastic Gradient Descent Algorithm with Adaptive Batch Size for Heterogeneous GPU Clusters
Xin Zhou
Ling Chen
Houming Wu
51
0
0
29 Aug 2023
FwdLLM: Efficient FedLLM using Forward Gradient
Mengwei Xu
Dongqi Cai
Yaozong Wu
Xiang Li
Shangguang Wang
FedML
118
26
0
26 Aug 2023
IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning
Feiyu F. Zhang
Liangzhi Li
Jun-Cheng Chen
Zhouqian Jiang
Bowen Wang
Yiming Qian
95
37
0
23 Aug 2023
Opening the Vocabulary of Egocentric Actions
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
VLM
101
18
0
22 Aug 2023
Enhancing Interpretable Object Abstraction via Clustering-based Slot Initialization
Ni Gao
Bernard Hohmann
Gerhard Neumann
OCL
47
2
0
22 Aug 2023
CoNe: Contrast Your Neighbours for Supervised Image Classification
Mingkai Zheng
Shan You
Lang Huang
Xiu Su
Fei Wang
Chao Qian
Xiaogang Wang
Chang Xu
VLM
57
0
0
21 Aug 2023
Investigation of Architectures and Receptive Fields for Appearance-based Gaze Estimation
Yunhan Wang
Xiangwei Shi
Shalini De Mello
H. Chang
Xucong Zhang
CVBM
52
3
0
18 Aug 2023
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation
Junru Lu
Siyu An
Mingbao Lin
Gabriele Pergola
Yulan He
Di Yin
Xing Sun
Yunsheng Wu
127
40
0
16 Aug 2023
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
Wei Gou
Ziyao Yi
Yan Xiang
Sha Li
Zibin Liu
Dehui Kong
Ke Xu
67
5
0
16 Aug 2023
TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
Aaditya Naik
Adam Stein
Yinjun Wu
Mayur Naik
Eric Wong
76
3
0
13 Aug 2023
Self-supervised Learning of Rotation-invariant 3D Point Set Features using Transformer and its Self-distillation
T. Furuya
Zhoujie Chen
Ryutarou Ohbuchi
Zhenzhong Kuang
3DPC
60
2
0
09 Aug 2023
Which Tokens to Use? Investigating Token Reduction in Vision Transformers
Joakim Bruslund Haurum
Sergio Escalera
Graham W. Taylor
T. Moeslund
ViT
104
38
0
09 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
76
3
0
07 Aug 2023
Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining
Xidong Wu
Zhengmian Hu
Jian Pei
Heng Huang
97
12
0
06 Aug 2023
FROD: Robust Object Detection for Free
Muhammad Awais
Weiming Zhuang
Lingjuan Lyu
Sung-Ho Bae
ObjD
89
1
0
03 Aug 2023
MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies
Kai Chen
Yusong Wu
Haohe Liu
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
DiffM
94
81
0
03 Aug 2023
CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters
S. Rajasekaran
M. Ghobadi
Aditya Akella
GNN
87
32
0
01 Aug 2023
Improving Pixel-based MIM by Reducing Wasted Modeling Capability
Yuan Liu
Songyang Zhang
Jiacheng Chen
Zhaohui Yu
Kai-xiang Chen
Dahua Lin
104
32
0
01 Aug 2023
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
82
9
0
27 Jul 2023
How to Scale Your EMA
Dan Busbridge
Jason Ramapuram
Pierre Ablin
Tatiana Likhomanenko
Eeshan Gunesh Dhekane
Xavier Suau
Russ Webb
82
19
0
25 Jul 2023
Tackling the Curse of Dimensionality with Physics-Informed Neural Networks
Zheyuan Hu
K. Shukla
George Karniadakis
Kenji Kawaguchi
PINN
AI4CE
180
104
0
23 Jul 2023
Robust Fully-Asynchronous Methods for Distributed Training over General Architecture
Zehan Zhu
Ye Tian
Yan Huang
Jinming Xu
Shibo He
OOD
85
2
0
21 Jul 2023
Tuning Pre-trained Model via Moment Probing
Mingze Gao
Qilong Wang
Zhenyi Lin
Pengfei Zhu
Qinghua Hu
Jingbo Zhou
76
8
0
21 Jul 2023
The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
Borja Rodríguez Gálvez
Arno Blaas
P. Rodríguez
Adam Goliñski
Xavier Suau
Jason Ramapuram
Dan Busbridge
Luca Zappella
82
7
0
20 Jul 2023
Multi-objective Evolutionary Search of Variable-length Composite Semantic Perturbations
Jialiang Sun
Wen Yao
Tingsong Jiang
Xiaoqian Chen
AAML
55
0
0
13 Jul 2023
AxonCallosumEM Dataset: Axon Semantic Segmentation of Whole Corpus Callosum cross section from EM Images
Ao Cheng
Guoqiang Zhao
Lirong Wang
Ruobing Zhang
54
3
0
05 Jul 2023
CAME: Confidence-guided Adaptive Memory Efficient Optimization
Yang Luo
Xiaozhe Ren
Zangwei Zheng
Zhuo Jiang
Xin Jiang
Yang You
ODL
87
22
0
05 Jul 2023
Review helps learn better: Temporal Supervised Knowledge Distillation
Dongwei Wang
Zhi Han
Yanmei Wang
Xi’ai Chen
Baichen Liu
Yandong Tang
148
1
0
03 Jul 2023
OSP: Boosting Distributed Model Training with 2-stage Synchronization
Zixuan Chen
Lei Shi
Xuandong Liu
Jiahui Li
Sen Liu
Yang Xu
105
4
0
29 Jun 2023
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Egor Shulgin
Peter Richtárik
AI4CE
108
6
0
28 Jun 2023
Separable Physics-Informed Neural Networks
Junwoo Cho
Seungtae Nam
Hyunmo Yang
S. Yun
Youngjoon Hong
Eunbyung Park
PINN
AI4CE
88
47
0
28 Jun 2023