1706.02677
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
Papers citing
"Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"
50 / 2,054 papers shown
Sample-Driven Federated Learning for Energy-Efficient and Real-Time IoT Sensing
Minh Ngoc Luu
Minh-Duong Nguyen
E. Bedeer
Van Duc Nguyen
D. Hoang
Diep N. Nguyen
Quoc-Viet Pham
62
3
0
11 Oct 2023
Excision And Recovery: Visual Defect Obfuscation Based Self-Supervised Anomaly Detection Strategy
Yeonghyeon Park
Sungho Kang
Myung Jin Kim
Yeonho Lee
Hyeong Seok Kim
Juneho Yi
AAML
80
2
0
06 Oct 2023
DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training
Aochuan Chen
Yimeng Zhang
Jinghan Jia
James Diffenderfer
Jiancheng Liu
Konstantinos Parasyris
Yihua Zhang
Zheng Zhang
B. Kailkhura
Sijia Liu
144
48
0
03 Oct 2023
Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection
Yiming Xie
Huaizu Jiang
Georgia Gkioxari
Julian Straub
3DPC
62
9
0
02 Oct 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
53
6
0
28 Sep 2023
Channel Vision Transformers: An Image Is Worth 1 x 16 x 16 Words
Yu Bao
Srinivasan Sivanandan
Theofanis Karaletsos
ViT
88
27
0
28 Sep 2023
Distortion Resilience for Goal-Oriented Semantic Communication
Minh-Duong Nguyen
Quang-Vinh Do
Zhaohui Yang
Quoc-Viet Pham
Won Joo Hwang
46
1
0
26 Sep 2023
Revisiting LARS for Large Batch Training Generalization of Neural Networks
K. Do
Duong Nguyen
Hoa Nguyen
Long Tran-Thanh
Nguyen-Hoang Tran
Quoc-Viet Pham
AI4CE
ODL
69
1
0
25 Sep 2023
Accelerating Large Batch Training via Gradient Signal to Noise Ratio (GSNR)
Guo-qing Jiang
Jinlong Liu
Zixiang Ding
Lin Guo
W. Lin
AI4CE
56
2
0
24 Sep 2023
It's Simplex! Disaggregating Measures to Improve Certified Robustness
Andrew C. Cullen
Paul Montague
Shijie Liu
S. Erfani
Benjamin I. P. Rubinstein
78
3
0
20 Sep 2023
On the different regimes of Stochastic Gradient Descent
Antonio Sclocchi
Matthieu Wyart
71
20
0
19 Sep 2023
Zero- and Few-shot Sound Event Localization and Detection
Kazuki Shimada
Kengo Uchida
Yuichiro Koyama
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
Tatsuya Kawahara
79
4
0
17 Sep 2023
Rethinking Learning Rate Tuning in the Era of Large Language Models
Hongpeng Jin
Wenqi Wei
Xuyu Wang
Wenbin Zhang
Yanzhao Wu
71
11
0
16 Sep 2023
TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification
Yiqian Cai
Peihong Zhang
Shengchen Li
84
7
0
15 Sep 2023
Fast FixMatch: Faster Semi-Supervised Learning with Curriculum Batch Size
John Chen
Chen Dun
Anastasios Kyrillidis
48
3
0
07 Sep 2023
BeeTLe: A Framework for Linear B-Cell Epitope Prediction and Classification
Xiao Yuan
57
3
0
05 Sep 2023
TSTTC: A Large-Scale Dataset for Time-to-Contact Estimation in Driving Scenarios
Yuheng Shi
Zehao Huang
Yan Yan
Naiyan Wang
Xiaojie Guo
51
1
0
04 Sep 2023
Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning
Minghao Zhu
Xiao Lin
Ronghao Dang
Chengju Liu
Qi Chen
VGen
83
9
0
01 Sep 2023
From SMOTE to Mixup for Deep Imbalanced Classification
Wei Cheng
Tan-Ha Mai
Hsuan-Tien Lin
51
3
0
29 Aug 2023
ABS-SGD: A Delayed Synchronous Stochastic Gradient Descent Algorithm with Adaptive Batch Size for Heterogeneous GPU Clusters
Xin Zhou
Ling Chen
Houming Wu
51
0
0
29 Aug 2023
FwdLLM: Efficient FedLLM using Forward Gradient
Mengwei Xu
Dongqi Cai
Yaozong Wu
Xiang Li
Shangguang Wang
FedML
118
26
0
26 Aug 2023
IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning
Feiyu F. Zhang
Liangzhi Li
Jun-Cheng Chen
Zhouqian Jiang
Bowen Wang
Yiming Qian
95
37
0
23 Aug 2023
Opening the Vocabulary of Egocentric Actions
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
VLM
101
18
0
22 Aug 2023
Enhancing Interpretable Object Abstraction via Clustering-based Slot Initialization
Ni Gao
Bernard Hohmann
Gerhard Neumann
OCL
47
2
0
22 Aug 2023
CoNe: Contrast Your Neighbours for Supervised Image Classification
Mingkai Zheng
Shan You
Lang Huang
Xiu Su
Fei Wang
Chao Qian
Xiaogang Wang
Chang Xu
VLM
57
0
0
21 Aug 2023
Investigation of Architectures and Receptive Fields for Appearance-based Gaze Estimation
Yunhan Wang
Xiangwei Shi
Shalini De Mello
H. Chang
Xucong Zhang
CVBM
52
3
0
18 Aug 2023
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation
Junru Lu
Siyu An
Mingbao Lin
Gabriele Pergola
Yulan He
Di Yin
Xing Sun
Yunsheng Wu
127
40
0
16 Aug 2023
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
Wei Gou
Ziyao Yi
Yan Xiang
Sha Li
Zibin Liu
Dehui Kong
Ke Xu
67
5
0
16 Aug 2023
TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
Aaditya Naik
Adam Stein
Yinjun Wu
Mayur Naik
Eric Wong
76
3
0
13 Aug 2023
Self-supervised Learning of Rotation-invariant 3D Point Set Features using Transformer and its Self-distillation
T. Furuya
Zhoujie Chen
Ryutarou Ohbuchi
Zhenzhong Kuang
3DPC
60
2
0
09 Aug 2023
Which Tokens to Use? Investigating Token Reduction in Vision Transformers
Joakim Bruslund Haurum
Sergio Escalera
Graham W. Taylor
T. Moeslund
ViT
104
38
0
09 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
76
3
0
07 Aug 2023
Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining
Xidong Wu
Zhengmian Hu
Jian Pei
Heng Huang
97
12
0
06 Aug 2023
FROD: Robust Object Detection for Free
Muhammad Awais
Weiming Zhuang
Lingjuan Lyu
Sung-Ho Bae
ObjD
89
1
0
03 Aug 2023
MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies
Kai Chen
Yusong Wu
Haohe Liu
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
DiffM
94
81
0
03 Aug 2023
CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters
S. Rajasekaran
M. Ghobadi
Aditya Akella
GNN
87
32
0
01 Aug 2023
Improving Pixel-based MIM by Reducing Wasted Modeling Capability
Yuan Liu
Songyang Zhang
Jiacheng Chen
Zhaohui Yu
Kai-xiang Chen
Dahua Lin
104
32
0
01 Aug 2023
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
82
9
0
27 Jul 2023
How to Scale Your EMA
Dan Busbridge
Jason Ramapuram
Pierre Ablin
Tatiana Likhomanenko
Eeshan Gunesh Dhekane
Xavier Suau
Russ Webb
82
19
0
25 Jul 2023
Tackling the Curse of Dimensionality with Physics-Informed Neural Networks
Zheyuan Hu
K. Shukla
George Karniadakis
Kenji Kawaguchi
PINN
AI4CE
180
104
0
23 Jul 2023
Robust Fully-Asynchronous Methods for Distributed Training over General Architecture
Zehan Zhu
Ye Tian
Yan Huang
Jinming Xu
Shibo He
OOD
85
2
0
21 Jul 2023
Tuning Pre-trained Model via Moment Probing
Mingze Gao
Qilong Wang
Zhenyi Lin
Pengfei Zhu
Qinghua Hu
Jingbo Zhou
76
8
0
21 Jul 2023
The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
Borja Rodríguez Gálvez
Arno Blaas
P. Rodríguez
Adam Goliński
Xavier Suau
Jason Ramapuram
Dan Busbridge
Luca Zappella
82
7
0
20 Jul 2023
Multi-objective Evolutionary Search of Variable-length Composite Semantic Perturbations
Jialiang Sun
Wen Yao
Tingsong Jiang
Xiaoqian Chen
AAML
55
0
0
13 Jul 2023
AxonCallosumEM Dataset: Axon Semantic Segmentation of Whole Corpus Callosum cross section from EM Images
Ao Cheng
Guoqiang Zhao
Lirong Wang
Ruobing Zhang
54
3
0
05 Jul 2023
CAME: Confidence-guided Adaptive Memory Efficient Optimization
Yang Luo
Xiaozhe Ren
Zangwei Zheng
Zhuo Jiang
Xin Jiang
Yang You
ODL
87
22
0
05 Jul 2023
Review helps learn better: Temporal Supervised Knowledge Distillation
Dongwei Wang
Zhi Han
Yanmei Wang
Xi’ai Chen
Baichen Liu
Yandong Tang
148
1
0
03 Jul 2023
OSP: Boosting Distributed Model Training with 2-stage Synchronization
Zixuan Chen
Lei Shi
Xuandong Liu
Jiahui Li
Sen Liu
Yang Xu
105
4
0
29 Jun 2023
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Egor Shulgin
Peter Richtárik
AI4CE
108
6
0
28 Jun 2023
Separable Physics-Informed Neural Networks
Junwoo Cho
Seungtae Nam
Hyunmo Yang
S. Yun
Youngjoon Hong
Eunbyung Park
PINN
AI4CE
88
47
0
28 Jun 2023