1706.02677
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"
50 / 2,054 papers shown
Title
PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
Stefano Zampini
Umberto Zerbinati
George Turkiyyah
David E. Keyes
67
5
0
18 Mar 2024
Embedded Named Entity Recognition using Probing Classifiers
Nicholas Popovic
Michael Färber
85
1
0
18 Mar 2024
A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques
Xuetong Li
Yuan Gao
Hong Chang
Danyang Huang
Yingying Ma
...
Ke Xu
Jing Zhou
Xuening Zhu
Yingqiu Zhu
Hansheng Wang
68
9
0
17 Mar 2024
DiPaCo: Distributed Path Composition
Arthur Douillard
Qixuan Feng
Andrei A. Rusu
A. Kuncoro
Yani Donchev
Rachita Chhaparia
Ionel Gog
Marc'Aurelio Ranzato
Jiajun Shen
Arthur Szlam
MoE
86
3
0
15 Mar 2024
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
Byeongjun Park
Hyojun Go
Jin-Young Kim
Sangmin Woo
Seokil Ham
Changick Kim
DiffM
MoE
106
13
0
14 Mar 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
109
63
0
13 Mar 2024
Cyclic Data Parallelism for Efficient Parallelism of Deep Neural Networks
Louis Fournier
Edouard Oyallon
94
0
0
13 Mar 2024
Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons
Simon Dufort-Labbé
P. D'Oro
Evgenii Nikishin
Razvan Pascanu
Pierre-Luc Bacon
A. Baratin
111
1
0
12 Mar 2024
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
Hongsun Jang
Jaeyong Song
Jaewon Jung
Jaeyoung Park
Youngsok Kim
Jinho Lee
43
16
0
11 Mar 2024
Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets
Lorenzo Brigato
Stavroula Mougiakakou
69
0
0
08 Mar 2024
DPAdapter: Improving Differentially Private Deep Learning through Noise Tolerance Pre-training
Zihao Wang
Rui Zhu
Dongruo Zhou
Zhikun Zhang
John C. Mitchell
Haixu Tang
Xiaofeng Wang
AAML
80
6
0
05 Mar 2024
HeAR -- Health Acoustic Representations
Sebastien Baur
Zaid Nabulsi
Wei-Hung Weng
Jake Garrison
Louis Blankemeier
...
Shwetak N. Patel
S. Shetty
Shruthi Prabhakara
Monde Muyoyeta
Diego Ardila
LM&MA
55
14
0
04 Mar 2024
A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing
Yu Wang
Wen Qu
78
0
0
04 Mar 2024
Better Schedules for Low Precision Training of Deep Neural Networks
Cameron R. Wolfe
Anastasios Kyrillidis
74
1
0
04 Mar 2024
FCDS: Fusing Constituency and Dependency Syntax into Document-Level Relation Extraction
Xudong Zhu
Zhao Kang
Bei Hui
65
3
0
04 Mar 2024
Towards Calibrated Deep Clustering Network
Yuheng Jia
Jianhong Cheng
Hui Liu
Junhui Hou
UQCV
136
1
0
04 Mar 2024
Leveraging AI Predicted and Expert Revised Annotations in Interactive Segmentation: Continual Tuning or Full Training?
Tiezheng Zhang
Xiaoxi Chen
Chongyu Qu
Alan Yuille
Zongwei Zhou
CLL
106
5
0
29 Feb 2024
Batch size invariant Adam
Xi Wang
Laurence Aitchison
87
2
0
29 Feb 2024
Learning to Deliver: a Foundation Model for the Montreal Capacitated Vehicle Routing Problem
Samuel J. K. Chin
Matthias Winkenbach
Akash Srivastava
59
0
0
28 Feb 2024
Stable LM 2 1.6B Technical Report
Marco Bellagente
J. Tow
Dakota Mahan
Duy Phung
Maksym Zhuravinskyi
...
Paulo Rocha
Harry Saini
H. Teufel
Niccolò Zanichelli
Carlos Riquelme
OSLM
104
58
0
27 Feb 2024
TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation
Zongying Lin
Hao Li
Liuzhenghao Lv
Lin Bin
Junwu Zhang
Calvin Yu-Chian Chen
Li Yuan
Tian Yonghong
76
3
0
27 Feb 2024
Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding
Benjamin Bergner
Andrii Skliar
Amelie Royer
Tijmen Blankevoort
Yuki Markus Asano
B. Bejnordi
101
7
0
26 Feb 2024
LiMAML: Personalization of Deep Recommender Models via Meta Learning
Ruofan Wang
Prakruthi Prabhakar
Gaurav Srivastava
Tianqi Wang
Zeinab S. Jalali
...
Divya Venugopalan
Aman Gupta
Fedor Borisyuk
S. Keerthi
A. Muralidharan
98
0
0
23 Feb 2024
Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems
Chen Wang
Kathryn M. Mohror
Marc Snir
34
1
0
21 Feb 2024
Random Aggregate Beamforming for Over-the-Air Federated Learning in Large-Scale Networks
Chunmei Xu
Shengheng Liu
Yongming Huang
Björn E. Ottersten
Dusit Niyato
FedML
82
3
0
20 Feb 2024
A Touch, Vision, and Language Dataset for Multimodal Alignment
Letian Fu
Gaurav Datta
Huang Huang
Will Panitch
Jaimyn Drake
Joseph Ortiz
Mustafa Mukadam
Mike Lambeta
Roberto Calandra
Ken Goldberg
VLM
91
43
0
20 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
127
2
0
19 Feb 2024
AdAdaGrad: Adaptive Batch Size Schemes for Adaptive Gradient Methods
Tim Tsz-Kit Lau
Han Liu
Mladen Kolar
ODL
82
6
0
17 Feb 2024
On the Effectiveness of Machine Learning-based Call Graph Pruning: An Empirical Study
A. Mir
Mehdi Keshani
Sebastian Proksch
55
1
0
11 Feb 2024
Pre-training of Lightweight Vision Transformers on Small Datasets with Minimally Scaled Images
Jen Hong Tan
ViT
26
3
0
06 Feb 2024
DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation
Yuan Gao
Haokun Chen
Xiang Wang
Zhicai Wang
Xue Wang
Jinyang Gao
Bolin Ding
74
6
0
05 Feb 2024
Revisiting the Power of Prompt for Visual Tuning
Yuzhu Wang
Lechao Cheng
Chaowei Fang
Dingwen Zhang
Manni Duan
Meng Wang
VLM
126
16
0
04 Feb 2024
BootsTAP: Bootstrapped Training for Tracking-Any-Point
Carl Doersch
Pauline Luc
Yi Yang
Dilara Gokay
Skanda Koppula
...
Joseph Heyward
Ignacio Rocco
Ross Goroshin
João Carreira
Andrew Zisserman
113
42
0
01 Feb 2024
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun
Zhen Qin
Weixuan Sun
Shidi Li
Dong Li
Xuyang Shen
Yu Qiao
Yiran Zhong
OffRL
122
11
0
29 Jan 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
54
0
0
26 Jan 2024
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Xinlei Chen
Zhuang Liu
Saining Xie
Kaiming He
DiffM
90
60
0
25 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
142
16
0
25 Jan 2024
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
VLM
69
0
0
22 Jan 2024
LW-FedSSL: Resource-efficient Layer-wise Federated Self-supervised Learning
Ye Lin Tun
Chu Myaet Thwal
Le Quang Huy
Minh N. H. Nguyen
Choong Seon Hong
FedML
153
2
0
22 Jan 2024
Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin
Atish Agarwala
Hossein Mobahi
FAtt
118
7
0
19 Jan 2024
Hijacking Attacks against Neural Networks by Analyzing Training Data
Yunjie Ge
Qian Wang
Huayang Huang
Qi Li
Cong Wang
Chao Shen
Lingchen Zhao
Peipei Jiang
Zheng Fang
Shenyi Zhang
91
0
0
18 Jan 2024
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
M. A. D. L. Balaguer
Vinamra Benara
Renato Luiz de Freitas Cunha
Roberto de M. Estevao Filho
Todd Hendry
...
Morris Sharp
B. Silva
Swati Sharma
Vijay Aski
Ranveer Chandra
FaML
117
92
0
16 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
163
1
0
15 Jan 2024
Application of 2D Homography for High Resolution Traffic Data Collection using CCTV Cameras
Linlin Zhang
Xiang Yu
Abdulateef Daud
Abdul Rashid Mussah
Y. Adu-Gyamfi
39
1
0
14 Jan 2024
Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing
Jaeill Kim
Duhun Hwang
Eunjung Lee
Jangwon Suh
Jimyeong Kim
Wonjong Rhee
51
0
0
11 Jan 2024
Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen
Marko Mihajlovic
Shaofei Wang
Sergey Prokudin
Siyu Tang
171
11
0
09 Jan 2024
NeRFmentation: NeRF-based Augmentation for Monocular Depth Estimation
Casimir Feldmann
Niall Siegenheim
Nikolas Hars
Lovro Rabuzin
Mert Ertugrul
Luca Wolfart
Marc Pollefeys
Z. Bauer
Martin R. Oswald
67
4
0
08 Jan 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
206
381
0
05 Jan 2024
Ravnest: Decentralized Asynchronous Training on Heterogeneous Devices
A. Menon
Unnikrishnan Menon
Kailash Ahirwar
60
1
0
03 Jan 2024
Detection-based Intermediate Supervision for Visual Question Answering
Yuhang Liu
Daowan Peng
Wei Wei
Yuanyuan Fu
Wenfeng Xie
Dangyang Chen
57
2
0
26 Dec 2023