Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Title
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks
Minyoung Huh
Brian Cheung
Pulkit Agrawal
Phillip Isola
MQ
54
55
0
15 May 2023
GeNAS: Neural Architecture Search with Better Generalization
Joonhyun Jeong
Joonsang Yu
Geondo Park
Dongyoon Han
Y. Yoo
78
4
0
15 May 2023
An Inverse Scaling Law for CLIP Training
Xianhang Li
Zeyu Wang
Cihang Xie
VLM, CLIP
117
58
0
11 May 2023
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD, ViT, VLM
84
80
0
11 May 2023
Boosting Distributed Machine Learning Training Through Loss-tolerant Transmission Protocol
Zixuan Chen
Lei Shi
Xuandong Liu
Xin Ai
Sen Liu
Yang Xu
27
6
0
07 May 2023
Annotation-efficient learning for OCT segmentation
Haoran Zhang
Jianlong Yang
Ce Zheng
Shiqing Zhao
Aili Zhang
MedIm
284
8
0
06 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
179
6
0
05 May 2023
Cuttlefish: Low-Rank Model Training without All the Tuning
Hongyi Wang
Saurabh Agarwal
Pongsakorn U-chupala
Yoshiki Tanaka
Eric P. Xing
Dimitris Papailiopoulos
OffRL
154
23
0
04 May 2023
Dynamic Sparse Training with Structured Sparsity
Mike Lasby
A. Golubeva
Utku Evci
Mihai Nica
Yani Andrew Ioannou
178
22
0
03 May 2023
The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold
Jialin Mao
Itay Griniasty
H. Teoh
Rahul Ramesh
Rubing Yang
Mark K. Transtrum
James P. Sethna
Pratik Chaudhari
3DPC
85
16
0
02 May 2023
Random Function Descent
Felix Benning
L. Döring
39
0
0
02 May 2023
Performance and Energy Consumption of Parallel Machine Learning Algorithms
Xidong Wu
Preston Brazzle
Stephen Cahoon
120
0
0
01 May 2023
Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation
Lukas Hoyer
Dengxin Dai
Luc Van Gool
AI4CE, OOD
131
26
0
26 Apr 2023
Img2Vec: A Teacher of High Token-Diversity Helps Masked AutoEncoders
Heng Pan
Chenyang Liu
Wenxiao Wang
Liejie Yuan
Hongfa Wang
Zhifeng Li
Wen Liu
VLM
64
3
0
25 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa, FedML, SSL
161
284
0
24 Apr 2023
The Disharmony between BN and ReLU Causes Gradient Explosion, but is Offset by the Correlation between Activations
Inyoung Paik
Jaesik Choi
81
1
0
23 Apr 2023
A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
113
22
0
21 Apr 2023
A Plug-and-Play Defensive Perturbation for Copyright Protection of DNN-based Applications
Donghua Wang
Wen Yao
Tingsong Jiang
Weien Zhou
Lang Lin
Xiaoqian Chen
AAML
85
4
0
20 Apr 2023
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner
Benedikt Alkin
Andreas Fürst
Elisabeth Rumetshofer
Lukas Miklautz
Sepp Hochreiter
111
18
0
20 Apr 2023
Securing Neural Networks with Knapsack Optimization
Yakir Gorski
Amir Jevnisek
S. Avidan
AAML
45
0
0
20 Apr 2023
LipsFormer: Introducing Lipschitz Continuity to Vision Transformers
Xianbiao Qi
Jianan Wang
Yihao Chen
Yukai Shi
Lei Zhang
98
20
0
19 Apr 2023
Parallel Neural Networks in Golang
Daniela Kalwarowskyj
Erich Schikuta
GNN
32
0
0
19 Apr 2023
Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
Jing An
Jianfeng Lu
62
4
0
18 Apr 2023
Understand Data Preprocessing for Effective End-to-End Training of Deep Neural Networks
Ping Gong
Yuxin Ma
Cheng-rong Li
Xiaosong Ma
S. Noh
49
2
0
18 Apr 2023
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
Yihao Chen
Xianbiao Qi
Jianan Wang
Lei Zhang
82
18
0
17 Apr 2023
DETR-based Layered Clothing Segmentation and Fine-Grained Attribute Recognition
Hao Tian
Yu Cao
P. Y. Mok
ViT
108
4
0
17 Apr 2023
EasyNER: A Customizable Easy-to-Use Pipeline for Deep Learning- and Dictionary-based Named Entity Recognition from Medical Text
Rafsan Ahmed
P. Berntsson
Alexander Skafte
Salma Kazemi Rashed
Marcus Klang
...
Ola Olde
William Lindholm
Antton Lamarca Arrizabalaga
P. Nugues
S. Aits
13
3
0
16 Apr 2023
ALiSNet: Accurate and Lightweight Human Segmentation Network for Fashion E-Commerce
Amrollah Seifoddini
K. Vernooij
Timon Künzle
A. Canopoli
Malte F. Alf
Anna Volokitin
Reza Shirvany
3DH
57
0
0
15 Apr 2023
Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution
Jiahao Chen
Bingyue Su
69
15
0
13 Apr 2023
Deep neural networks have an inbuilt Occam's razor
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV, BDL
74
16
0
13 Apr 2023
Hard Patches Mining for Masked Image Modeling
Haochen Wang
Kaiyou Song
Junsong Fan
Yuxi Wang
Jin Xie
Zhaoxiang Zhang
68
63
0
12 Apr 2023
Homogenizing Non-IID datasets via In-Distribution Knowledge Distillation for Decentralized Learning
Deepak Ravikumar
Gobinda Saha
Sai Aparna Aketi
Kaushik Roy
88
2
0
09 Apr 2023
Propheter: Prophetic Teacher Guided Long-Tailed Distribution Learning
Wenxiang Xu
Lin Chen
Linyun Zhou
Jie Lei
Lechao Cheng
Zunlei Feng
Min-Gyoo Song
77
1
0
09 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
105
43
0
07 Apr 2023
Diffusion Models as Masked Autoencoders
Chen Wei
K. Mangalam
Po-Yao (Bernie) Huang
Yanghao Li
Haoqi Fan
Hu Xu
Huiyu Wang
Cihang Xie
Alan Yuille
Christoph Feichtenhofer
DiffM, SyDa
100
53
0
06 Apr 2023
Bridging the Language Gap: Knowledge Injected Multilingual Question Answering
Zhichao Duan
Xiuxing Li
Zhengyan Zhang
Zhenyu Li
Ning Liu
Jianyong Wang
67
8
0
06 Apr 2023
Inductive biases in deep learning models for weather prediction
Jannik Thümmel
Matthias Karlbauer
S. Otte
C. Zarfl
Georg Martius
...
Thomas Scholten
Ulrich Friedrich
V. Wulfmeyer
B. Goswami
Martin Volker Butz
AI4CE
107
6
0
06 Apr 2023
Robustmix: Improving Robustness by Regularizing the Frequency Bias of Deep Nets
Jonas Ngnawé
Marianne Abémgnigni Njifon
Jonathan Heek
Yann N. Dauphin
OOD
42
5
0
06 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM, VLM
468
7,471
0
05 Apr 2023
Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT
Kai Chen
Gordon Wichern
François G. Germain
Jonathan Le Roux
AI4TS
62
0
0
04 Apr 2023
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
89
16
0
04 Apr 2023
ERM++: An Improved Baseline for Domain Generalization
Piotr Teterwak
Kuniaki Saito
Theodoros Tsiligkaridis
Kate Saenko
Bryan A. Plummer
OOD
93
10
0
04 Apr 2023
Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training
Haram Choi
Cheolwoong Na
Jinseop S. Kim
Jihoon Yang
ViT
53
3
0
04 Apr 2023
SLPerf: a Unified Framework for Benchmarking Split Learning
Tianchen Zhou
Zhanyi Hu
Bingzhe Wu
Cen Chen
FedML
80
4
0
04 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
151
1,311
0
03 Apr 2023
Towards Understanding the Mechanism of Contrastive Learning via Similarity Structure: A Theoretical Analysis
Hiroki Waida
Yuichiro Wada
Léo Andéol
Takumi Nakagawa
Yuhui Zhang
Takafumi Kanamori
SSL
79
6
0
01 Apr 2023
DOAD: Decoupled One Stage Action Detection Network
Shuning Chang
Pichao Wang
Fan Wang
Jiashi Feng
Mike Zheng Shou
65
4
0
01 Apr 2023
Analysis of Failures and Risks in Deep Learning Model Converters: A Case Study in the ONNX Ecosystem
Purvish Jajal
Wenxin Jiang
Arav Tewari
Erik Kocinare
Joseph Woo
Anusha Sarraf
Yung-Hsiang Lu
George K. Thiruvathukal
James C. Davis
62
0
0
30 Mar 2023
Neglected Free Lunch -- Learning Image Classifiers Using Annotation Byproducts
Dongyoon Han
Junsuk Choe
Dante Chun
John Joon Young Chung
Minsuk Chang
Sangdoo Yun
Jean Y. Song
Seong Joon Oh
OOD
654
4
1
30 Mar 2023
Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning
Chongjian Ge
Jiangliu Wang
Zhan Tong
Shoufa Chen
Yibing Song
Ping Luo
SSL
75
28
0
30 Mar 2023