2404.06679
Cited By
Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution
10 April 2024
Brandon Morgan
Dean Frederick Hougen
ODL
Papers citing
"Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution"
29 / 29 papers shown
Title
Neural Loss Function Evolution for Large-Scale Image Classifier Convolutional Neural Networks
Brandon Morgan
Dean Frederick Hougen
47
2
0
30 Jan 2024
EfficientNetV2: Smaller Models and Faster Training
Mingxing Tan
Quoc V. Le
EgoV
122
2,705
0
01 Apr 2021
Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search
Peidong Liu
Gengwei Zhang
Bochao Wang
Hang Xu
Xiaodan Liang
Yong Jiang
Zhenguo Li
69
28
0
09 Feb 2021
AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients
Juntang Zhuang
Tommy M. Tang
Yifan Ding
S. Tatikonda
Nicha Dvornek
X. Papademetris
James S. Duncan
ODL
165
517
0
15 Oct 2020
Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
Shen-Yi Zhao
Chang-Wei Shi
Yin-Peng Xie
Wu-Jun Li
ODL
66
10
0
28 Jul 2020
Evolving Normalization-Activation Layers
Hanxiao Liu
Andrew Brock
Karen Simonyan
Quoc V. Le
91
80
0
06 Apr 2020
Evolutionary Optimization of Deep Learning Activation Functions
G. Bingham
William Macke
Risto Miikkulainen
ODL
48
50
0
17 Feb 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
520
42,449
0
03 Dec 2019
Demon: Improved Neural Network Training with Momentum Decay
John Chen
Cameron R. Wolfe
Zhaoqi Li
Anastasios Kyrillidis
ODL
66
15
0
11 Oct 2019
RandAugment: Practical automated data augmentation with a reduced search space
E. D. Cubuk
Barret Zoph
Jonathon Shlens
Quoc V. Le
MQ
234
3,490
0
30 Sep 2019
Learning an Adaptive Learning Rate Schedule
Zhen Xu
Andrew M. Dai
Jonas Kemp
Luke Metz
62
62
0
20 Sep 2019
AutoML: A Survey of the State-of-the-Art
Xin He
Kaiyong Zhao
Xiaowen Chu
129
1,457
0
02 Aug 2019
Calibrating the Adaptive Learning Rate to Improve Convergence of ADAM
Qianqian Tong
Guannan Liang
J. Bi
85
7
0
02 Aug 2019
Why gradient clipping accelerates training: A theoretical justification for adaptivity
J.N. Zhang
Tianxing He
S. Sra
Ali Jadbabaie
76
464
0
28 May 2019
On the Convergence Proof of AMSGrad and a New Version
Phuong T. Tran
L. T. Phong
ODL
63
87
0
07 Apr 2019
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
Akhilesh Deepak Gotmare
N. Keskar
Caiming Xiong
R. Socher
ODL
71
276
0
29 Oct 2018
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
142
130
0
16 Oct 2018
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Noam M. Shazeer
Mitchell Stern
ODL
76
1,048
0
11 Apr 2018
Aggregated Momentum: Stability Through Passive Damping
James Lucas
Shengyang Sun
R. Zemel
Roger C. Grosse
59
68
0
01 Apr 2018
Regularized Evolution for Image Classifier Architecture Search
Esteban Real
A. Aggarwal
Yanping Huang
Quoc V. Le
160
3,031
0
05 Feb 2018
Neural Optimizer Search with Reinforcement Learning
Irwan Bello
Barret Zoph
Vijay Vasudevan
Quoc V. Le
ODL
62
385
0
21 Sep 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
166
1,096
0
07 Aug 2017
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
177
5,605
0
21 Jul 2017
YellowFin and the Art of Momentum Tuning
Jian Zhang
Ioannis Mitliagkas
ODL
61
108
0
12 Jun 2017
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
468
5,373
0
05 Nov 2016
Cyclical Learning Rates for Training Neural Networks
L. Smith
ODL
210
2,529
0
03 Jun 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,305
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.9K
150,115
0
22 Dec 2014
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
155
6,625
0
22 Dec 2012