Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

8 June 2017
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
    3DH

Papers citing "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"

50 / 2,054 papers shown
Title
GroupNL: Low-Resource and Robust CNN Design over Cloud and Device
Chuntao Ding
Jianhang Xie
Junna Zhang
Salman Raza
Shangguang Wang
Jiannong Cao
OOD
42
0
0
14 Jun 2025
Is your batch size the problem? Revisiting the Adam-SGD gap in language modeling
Teodora Srećković
Jonas Geiping
Antonio Orvieto
MoE
26
0
0
14 Jun 2025
NysAct: A Scalable Preconditioned Gradient Descent using Nystrom Approximation
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
30
0
0
10 Jun 2025
An Adaptive Method Stabilizing Activations for Enhanced Generalization
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
28
0
0
10 Jun 2025
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
Leqi Shen
Guoqiang Gong
Tianxiang Hao
Tao He
Yifeng Zhang
Pengzhang Liu
Sicheng Zhao
Jungong Han
Guiguang Ding
24
0
0
10 Jun 2025
Integration of Old and New Knowledge for Generalized Intent Discovery: A Consistency-driven Prototype-Prompting Framework
Xiao Wei
Xiaobao Wang
Ning Zhuang
Chenyang Wang
L. Wang
Jianwu Dang
19
0
0
10 Jun 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li
Yutong Chen
Yiqian Wu
Kaifeng Zhao
Marc Pollefeys
Siyu Tang
EgoV VLM
38
0
0
09 Jun 2025
BiMa: Towards Biases Mitigation for Text-Video Retrieval via Scene Element Guidance
Huy Le
Nhat Chung
Tung Kieu
A. Nguyen
Ngan Le
70
0
0
04 Jun 2025
No Audiogram: Leveraging Existing Scores for Personalized Speech Intelligibility Prediction
Haoshuai Zhou
Changgeng Mo
Boxuan Cao
Linkai Li
Shan Xiang Wang
20
0
0
31 May 2025
SST: Self-training with Self-adaptive Thresholding for Semi-supervised Learning
Shuai Zhao
Heyan Huang
Xinge Li
Xiaokang Chen
Rui Wang
35
0
0
31 May 2025
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations
Benjamin Holzschuh
Qiang Liu
Georg Kohl
Nils Thuerey
AI4CE
46
1
0
30 May 2025
Chameleon: A MatMul-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data
Douwe den Blanken
Charlotte Frenkel
30
0
0
30 May 2025
How far away are truly hyperparameter-free learning algorithms?
Priya Kasimbeg
Vincent Roulet
Naman Agarwal
Sourabh Medapati
Fabian Pedregosa
Atish Agarwala
George E. Dahl
22
0
0
29 May 2025
Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning
Hongyao Chen
Tianyang Xu
Xiaojun Wu
Josef Kittler
FedML
23
0
0
28 May 2025
Understand, Think, and Answer: Advancing Visual Reasoning with Large Multimodal Models
Yufei Zhan
Hongyin Zhao
Yousong Zhu
Shurong Zheng
Fan Yang
Ming Tang
Jinqiao Wang
VLM LRM
54
0
0
27 May 2025
Variational Deep Learning via Implicit Regularization
Jonathan Wenger
Beau Coker
Juraj Marusic
John P. Cunningham
OOD UQCV BDL
56
0
0
26 May 2025
Towards Anonymous Neural Network Inference
Liao Peiyuan
37
0
0
23 May 2025
A Two-Stage Data Selection Framework for Data-Efficient Model Training on Edge Devices
Chen Gong
Rui Xing
Zhenzhe Zheng
Fan Wu
61
0
0
22 May 2025
Mean Flows for One-step Generative Modeling
Zhengyang Geng
Mingyang Deng
Xingjian Bai
J. Zico Kolter
Kaiming He
DiffM
83
2
0
19 May 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM VGen
241
0
0
18 May 2025
Search-TTA: A Multimodal Test-Time Adaptation Framework for Visual Search in the Wild
Derek Ming Siang Tan
Shailesh
Boyang Liu
Alok Raj
Qi Xuan Ang
...
Tanishq Duhan
Jimmy Chiun
Yuhong Cao
Florian Shkurti
Guillaume Sartoretti
55
0
0
16 May 2025
Real2Render2Real: Scaling Robot Data Without Dynamics Simulation or Robot Hardware
Justin Yu
Letian Fu
Huang Huang
Karim El-Refai
Rares Andrei Ambrus
Richard Cheng
Muhammad Zubair Irshad
Ken Goldberg
77
1
0
14 May 2025
Dynamic Snake Upsampling Operater and Boundary-Skeleton Weighted Loss for Tubular Structure Segmentation
Yiqi Chen
Ganghai Huang
Sheng Zhang
Jianglin Dai
225
0
0
13 May 2025
Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos
Markos Stamatakis
Joshua Berger
Christian Wartena
Ralph Ewerth
Anett Hoppe
AI4Ed
120
0
0
03 May 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization
Shunxian Gu
Chaoqun You
Bangbang Ren
Lailong Luo
Junxu Xia
Deke Guo
72
0
0
02 May 2025
A multi-scale vision transformer-based multimodal GeoAI model for mapping Arctic permafrost thaw
Wenwen Li
Chia-Yu Hsu
Sizhe Wang
Zhining Gu
Yili Yang
Brendan M. Rogers
A. Liljedahl
102
0
0
23 Apr 2025
Enhancing Reinforcement learning in 3-Dimensional Hydrophobic-Polar Protein Folding Model with Attention-based layers
Peizheng Liu
Hitoshi Iba
67
0
0
22 Apr 2025
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Stefan Stojanov
David Wendt
Seungwoo Kim
R. Venkatesh
Kevin T. Feigelis
Jiajun Wu
Daniel L. K. Yamins
SSL
99
0
0
25 Mar 2025
Fractal-IR: A Unified Framework for Efficient and Scalable Image Restoration
Yawei Li
Bin Ren
Christos Sakaridis
Rakesh Ranjan
Mengyuan Liu
N. Sebe
Ming-Hsuan Yang
Luca Benini
101
0
0
22 Mar 2025
OmniLearn: A Framework for Distributed Deep Learning over Heterogeneous Clusters
S. Tyagi
Prateek Sharma
138
0
0
21 Mar 2025
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
Hongda Liu
Longguang Wang
Ye Zhang
Ziru Yu
Yulan Guo
Mamba
117
0
0
20 Mar 2025
LipShiFT: A Certifiably Robust Shift-based Vision Transformer
Rohan Menon
Nicola Franco
Stephan Günnemann
80
0
0
18 Mar 2025
Quantum EigenGame for excited state calculation
David Quiroga
Jason Han
Anastasios Kyrillidis
116
0
0
17 Mar 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
115
3
0
17 Mar 2025
Atlas: Multi-Scale Attention Improves Long Context Image Modeling
Kumar Krishna Agrawal
Long Lian
Lu Liu
Natalia Harguindeguy
Boyi Li
Alexander Bick
Maggie Chung
Trevor Darrell
Adam Yala
ViT
83
0
0
16 Mar 2025
Self-Supervised Pretraining for Fine-Grained Plankton Recognition
Joona Kareinen
T. Eerola
K. Kraft
L. Lensu
S. Suikkanen
Heikki Kälviäinen
SSL
496
0
0
14 Mar 2025
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation
Leonard Waldmann
Ando Shah
Yi Wang
Nils Lehmann
Adam J. Stewart
Zhitong Xiong
Xiao Xiang Zhu
Stefan Bauer
John Chuang
74
4
0
13 Mar 2025
Routing for Large ML Models
Ofir Cohen
Jose Yallouz
Michael Schapira
Shahar Belkar
Tal Mizrahi
73
0
0
07 Mar 2025
ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models
Qinyu Zhao
Stephen Gould
Liang Zheng
DiffM GAN VGen VLM
101
1
0
04 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
Paul Janson
Vaibhav Singh
Paria Mehrbod
Adam Ibrahim
Irina Rish
Eugene Belilovsky
Benjamin Thérien
CLL
130
1
0
04 Mar 2025
Syntactic Learnability of Echo State Neural Language Models at Scale
Ryo Ueda
Tatsuki Kuribayashi
Shunsuke Kando
Kentaro Inui
97
0
0
03 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
188
1
0
03 Mar 2025
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization
Xinyi Wan
Penghui Qi
Guangxing Huang
Jialin Li
Min Lin
82
0
0
03 Mar 2025
VRM: Knowledge Distillation via Virtual Relation Matching
W. Zhang
Fei Xie
Weidong Cai
Chao Ma
210
0
0
28 Feb 2025
Same accuracy, twice as fast: continuous training surpasses retraining from scratch
Eli Verwimp
Guy Hacohen
Tinne Tuytelaars
OnRL
79
0
0
28 Feb 2025
Super-Resolution for Interferometric Imaging: Model Comparisons and Performance Analysis
Hasan Berkay Abdioglu
Rana Gursoy
Yagmur Isik
Ibrahim Cem Balci
Taha Unal
...
Mustafa Ismail Inal
Nehir Serin
Muhammed Furkan Kosar
G. B. Esmer
H. Uvet
79
0
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
235
49
0
24 Feb 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma
Nolan Dey
Gurpreet Gosal
Gavia Gray
Daria Soboleva
Joel Hestness
109
8
0
21 Feb 2025
On Memorization in Diffusion Models
Xiangming Gu
Chao Du
Tianyu Pang
Chongxuan Li
Min Lin
Ye Wang
DiffM TDI
347
55
0
21 Feb 2025
Learn2Mix: Training Neural Networks Using Adaptive Data Integration
Shyam Venkatasubramanian
Vahid Tarokh
174
0
0
17 Feb 2025