1706.02677
Cited By
v1
v2 (latest)
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
Papers citing
"Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"
50 / 2,054 papers shown
Title
IQAGPT: Image Quality Assessment with Vision-language and ChatGPT Models
Zhihao Chen
Bin Hu
Chuang Niu
Tao Chen
Yuxin Li
Hongming Shan
Ge Wang
LM&MA
MLLM
66
4
0
25 Dec 2023
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
Junfei Xiao
Ziqi Zhou
Wenxuan Li
Shiyi Lan
Jieru Mei
Zhiding Yu
Alan Yuille
Yuyin Zhou
Cihang Xie
VLM
58
1
0
21 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
81
5
0
21 Dec 2023
On the Role of Server Momentum in Federated Learning
Jianhui Sun
Xidong Wu
Heng-Chiao Huang
Aidong Zhang
FedML
118
11
0
19 Dec 2023
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
Xin Guo
Jiangwei Lao
Bo Dang
Yingying Zhang
Lei Yu
...
Jian Wang
Jingdong Chen
Ming Yang
Yongjun Zhang
Yansheng Li
154
129
0
15 Dec 2023
Mini-batch Gradient Descent with Buffer
Haobo Qi
Du Huang
Yingqiu Zhu
Danyang Huang
Hansheng Wang
45
1
0
14 Dec 2023
Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks
Bo Li
Wei Ye
Quan-ding Wang
Wen Zhao
Shikun Zhang
VLM
72
2
0
14 Dec 2023
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
55
12
0
14 Dec 2023
PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images
Tao Zhang
Kun Ding
Jinyong Wen
Yu Xiong
Zeyu Zhang
Shiming Xiang
Chunhong Pan
55
3
0
13 Dec 2023
MCFNet: Multi-scale Covariance Feature Fusion Network for Real-time Semantic Segmentation
Xiaojie Fang
Xingguo Song
Xiangyin Meng
Xu Fang
Sheng Jin
45
0
0
12 Dec 2023
One-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng
Ashwini Pokle
Trevor Killeen
75
33
0
12 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
99
74
0
11 Dec 2023
Counterfactual World Modeling for Physical Dynamics Understanding
Rahul Venkatesh
Honglin Chen
Kevin T. Feigelis
Daniel M. Bear
Khaled Jedoui
...
Wanhee Lee
Sherry Liu
Kevin A. Smith
Judith E. Fan
Daniel L. K. Yamins
VGen
85
2
0
11 Dec 2023
Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections
Marcel Wagenländer
Guo Li
Bo Zhao
Kai Zou
Peter R. Pietzuch
96
7
0
08 Dec 2023
LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures
Vimal Thilak
Chen Huang
Omid Saremi
Laurent Dinh
Hanlin Goh
Preetum Nakkiran
Josh Susskind
Etai Littwin
109
10
0
07 Dec 2023
The Landscape of Modern Machine Learning: A Review of Machine, Distributed and Federated Learning
Omer Subasi
Oceane Bel
Joseph Manzano
Kevin J. Barker
FedML
OOD
PINN
91
2
0
05 Dec 2023
MoSA: Mixture of Sparse Adapters for Visual Efficient Tuning
Qizhe Zhang
Bocheng Zou
Ruichuan An
Jiaming Liu
Shanghang Zhang
MoE
91
3
0
05 Dec 2023
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
Shentong Mo
Pedro Morgado
80
14
0
02 Dec 2023
Improving Normalization with the James-Stein Estimator
Seyedalireza Khoshsirat
Chandra Kambhamettu
73
5
0
01 Dec 2023
PipeOptim: Ensuring Effective 1F1B Schedule with Optimizer-Dependent Weight Prediction
Lei Guan
Dongsheng Li
Jiye Liang
Wenjian Wang
Xicheng Lu
152
1
0
01 Dec 2023
Generalisable Agents for Neural Network Optimisation
Kale-ab Tessera
C. Tilbury
Sasha Abramowitz
Ruan de Kock
Omayma Mahjoub
Benjamin Rosman
Sara Hooker
Arnu Pretorius
AI4CE
74
0
0
30 Nov 2023
Zero Bubble Pipeline Parallelism
Penghui Qi
Xinyi Wan
Guangxing Huang
Min Lin
72
25
0
30 Nov 2023
Towards Higher Ranks via Adversarial Weight Pruning
Yuchuan Tian
Hanting Chen
Tianyu Guo
Chao Xu
Yunhe Wang
63
2
0
29 Nov 2023
Deep Learning for Time Series Classification of Parkinson's Disease Eye Tracking Data
Gonzalo Uribarri
Simon Ekman von Huth
Josefine Waldthaler
Per Svenningsson
Erik Fransén
78
7
0
28 Nov 2023
Tell2Design: A Dataset for Language-Guided Floor Plan Generation
Sicong Leng
Yangqiaoyu Zhou
Mohammed Haroon Dupty
W. Lee
Sam Joyce
Wei Lu
3DV
67
15
0
27 Nov 2023
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn
Junghyun Lee
Bohan Wang
Huishuai Zhang
Chulhee Yun
82
1
0
25 Nov 2023
Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations
Cian Eastwood
Julius von Kügelgen
Linus Ericsson
Diane Bouchacourt
Pascal Vincent
Bernhard Schölkopf
Mark Ibrahim
107
11
0
15 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
78
3
0
15 Nov 2023
Efficient Rotation Invariance in Deep Neural Networks through Artificial Mental Rotation
Lukas Tuggener
Thilo Stadelmann
Jürgen Schmidhuber
OOD
54
1
0
14 Nov 2023
A Coefficient Makes SVRG Effective
Yida Yin
Zhiqiu Xu
Zhiyuan Li
Trevor Darrell
Zhuang Liu
89
1
0
09 Nov 2023
Robust Fine-Tuning of Vision-Language Models for Domain Generalization
Kevin Vogt-Lowell
Noah Lee
Theodoros Tsiligkaridis
Marc Vaillant
VLM
85
4
0
03 Nov 2023
Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training
Yipeng Gao
Zeyu Wang
Wei-Shi Zheng
Cihang Xie
Yuyin Zhou
3DPC
150
10
0
03 Nov 2023
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
103
11
0
01 Nov 2023
Generalization Bounds for Label Noise Stochastic Gradient Descent
Jung Eun Huh
Patrick Rebeschini
61
1
0
01 Nov 2023
Self-Supervised Pre-Training for Precipitation Post-Processor
Sojung An
Junha Lee
Jiyeon Jang
Inchae Na
Wooyeon Park
Sujeong You
AI4Cl
74
1
0
31 Oct 2023
Weakly-Supervised Surgical Phase Recognition
Roy Hirsch
Regev Cohen
Mathilde Caron
Tomer Golany
Daniel Freedman
Ehud Rivlin
38
1
0
26 Oct 2023
TiC-CLIP: Continual Training of CLIP Models
Saurabh Garg
Mehrdad Farajtabar
Hadi Pouransari
Raviteja Vemulapalli
Sachin Mehta
Oncel Tuzel
Vaishaal Shankar
Fartash Faghri
VLM
CLIP
121
31
0
24 Oct 2023
CalibrationPhys: Self-supervised Video-based Heart and Respiratory Rate Measurements by Calibrating Between Multiple Cameras
Yusuke Akamatsu
Terumi Umematsu
Hitoshi Imaoka
71
7
0
23 Oct 2023
From the Pursuit of Universal AGI Architecture to Systematic Approach to Heterogenous AGI: Addressing Alignment, Energy, & AGI Grand Challenges
Eren Kurshan
110
0
0
23 Oct 2023
A Quadratic Synchronization Rule for Distributed Deep Learning
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
92
1
0
22 Oct 2023
Zone Evaluation: Revealing Spatial Bias in Object Detection
Zhaohui Zheng
Yuming Chen
Qibin Hou
Xiang Li
Ping Wang
Ming-Ming Cheng
ObjD
114
4
0
20 Oct 2023
Frozen Transformers in Language Models Are Effective Visual Encoder Layers
Ziqi Pang
Ziyang Xie
Yunze Man
Yu-Xiong Wang
144
27
0
19 Oct 2023
Cooperative Minibatching in Graph Neural Networks
M. F. Balin
Dominique LaSalle
Ümit V. Çatalyürek
GNN
69
1
0
19 Oct 2023
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression
Adam Block
Dylan J. Foster
Akshay Krishnamurthy
Max Simchowitz
Cyril Zhang
77
7
0
17 Oct 2023
Llemma: An Open Language Model For Mathematics
Zhangir Azerbayev
Hailey Schoelkopf
Keiran Paster
Marco Dos Santos
Stephen Marcus McAleer
Albert Q. Jiang
Jia Deng
Stella Biderman
Sean Welleck
CLL
126
303
0
16 Oct 2023
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
Truong Thao Nguyen
Balazs Gerofi
Edgar Josafat Martinez-Noriega
François Trahay
Mohamed Wahib
56
1
0
16 Oct 2023
New Advances in Body Composition Assessment with ShapedNet: A Single Image Deep Regression Approach
N. M. Nascimento
Pedro Cavalcante de Sousa Junior
Pedro Yuri Rodrigues Nunes
S. P. P. Silva
L. L. Loureiro
V. Z. Bittencourt
Valden Capistrano Junior
Pedro Pedrosa Rebouças Filho
3DH
16
0
0
15 Oct 2023
3D Understanding of Deformable Linear Objects: Datasets and Transferability Benchmark
Bare Luka Žagar
Tim Hertel
Mingyu Liu
Ekim Yurtsever
Alois Knoll
3DPC
60
0
0
13 Oct 2023
AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE
Wei Ao
Vishnu Boddeti
AAML
82
20
0
12 Oct 2023
Context-Enhanced Detector For Building Detection From Remote Sensing Images
Ziyue Huang
Mingming Zhang
Qingjie Liu
Wei Wang
Zhe Dong
Yunhong Wang
68
1
0
11 Oct 2023