Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.05584
Cited By
v1
v2 (latest)
FourCastNeXt: Optimizing FourCastNet Training for Limited Compute
10 January 2024
Edison Guo
Maruf Ahmed
Yue Sun
Rui Yang
Harrison Cook
Tennessee Leeuwenburg
Ben Evans
Re-assign community
ArXiv (abs)
PDF
HTML
Github (16★)
Papers citing
"FourCastNeXt: Optimizing FourCastNet Training for Limited Compute"
15 / 15 papers shown
Title
FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators
Thorsten Kurth
Shashank Subramanian
P. Harrington
Jaideep Pathak
Morteza Mardani
D. Hall
Andrea Miele
K. Kashinath
Anima Anandkumar
AI4Cl
86
193
0
08 Aug 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
95
369
0
02 Jun 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
529
6,293
0
05 Apr 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
208
1,987
0
29 Mar 2022
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE
AI4CE
131
162
0
01 Mar 2022
FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators
Jaideep Pathak
Shashank Subramanian
P. Harrington
S. Raja
Ashesh Chattopadhyay
...
Zong-Yi Li
Kamyar Azizzadenesheli
Pedram Hassanzadeh
K. Kashinath
Anima Anandkumar
AI4Cl
240
709
0
22 Feb 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
186
5,213
0
10 Jan 2022
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
467
21,603
0
25 Mar 2021
Curriculum Learning: A Survey
Petru Soviany
Radu Tudor Ionescu
Paolo Rota
N. Sebe
ODL
157
359
0
25 Jan 2021
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
145
996
0
12 Feb 2020
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
164
18,193
0
28 May 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
269
998
0
01 Apr 2019
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
69
280
0
14 Dec 2018
Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith
Pieter-Jan Kindermans
Chris Ying
Quoc V. Le
ODL
107
996
0
01 Nov 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
1.2K
20,900
0
17 Apr 2017
1