SGDR: Stochastic Gradient Descent with Warm Restarts

13 August 2016
I. Loshchilov
Frank Hutter
    ODL

Papers citing "SGDR: Stochastic Gradient Descent with Warm Restarts"

50 / 4,280 papers shown
Mutual Information Scaling and Expressive Power of Sequence Models
Huitao Shen
20
18
0
10 May 2019
EENA: Efficient Evolution of Neural Architecture
Hui Zhu
Zhulin An
Chuanguang Yang
Kaiqiang Xu
Erhu Zhao
Yongjun Xu
3DV
34
39
0
10 May 2019
Learning Representations for Predicting Future Activities
Mohammadreza Zolfaghari
Özgün Çiçek
S. M. Ali
F. Mahdisoltani
Can Zhang
Thomas Brox
AI4TS
10
6
0
09 May 2019
Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS
Yangzhou Jiang
Cong Zhao
Zeyang Dou
Lei Pang
14
5
0
07 May 2019
Omni-Scale Feature Learning for Person Re-Identification
Kaiyang Zhou
Yongxin Yang
Andrea Cavallaro
Tao Xiang
30
821
0
02 May 2019
Segmentation is All You Need
Zehua Cheng
Yuxiang Wu
Zhenghua Xu
Thomas Lukasiewicz
Weiyan Wang
33
20
0
30 Apr 2019
Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources
Yanghua Peng
Hang Zhang
Yifei Ma
Tong He
Zhi-Li Zhang
Sheng Zha
Mu Li
28
23
0
26 Apr 2019
Analytical Moment Regularizer for Gaussian Robust Networks
Modar Alfadly
Adel Bibi
Guohao Li
AAML
19
4
0
24 Apr 2019
Attention Augmented Convolutional Networks
Irwan Bello
Barret Zoph
Ashish Vaswani
Jonathon Shlens
Quoc V. Le
46
999
0
22 Apr 2019
Knowledge Distillation via Route Constrained Optimization
Xiao Jin
Baoyun Peng
Yichao Wu
Yu Liu
Jiaheng Liu
Ding Liang
Junjie Yan
Xiaolin Hu
20
169
0
19 Apr 2019
EvalNorm: Estimating Batch Normalization Statistics for Evaluation
Saurabh Singh
Abhinav Shrivastava
26
51
0
12 Apr 2019
Autoregressive Energy Machines
C. Nash
Conor Durkan
31
55
0
11 Apr 2019
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
Brandon Yang
Gabriel Bender
Quoc V. Le
Jiquan Ngiam
MedIm
3DV
31
622
0
10 Apr 2019
Label Propagation for Deep Semi-supervised Learning
Ahmet Iscen
Giorgos Tolias
Yannis Avrithis
Ondřej Chum
SSL
24
621
0
09 Apr 2019
Semi-Supervised Segmentation of Salt Bodies in Seismic Images using an Ensemble of Convolutional Neural Networks
Yauhen Babakhin
A. Sanakoyeu
Hirotoshi Kitamura
24
59
0
09 Apr 2019
Relational Action Forecasting
Chen Sun
Abhinav Shrivastava
Carl Vondrick
Rahul Sukthankar
Kevin Patrick Murphy
Cordelia Schmid
36
79
0
08 Apr 2019
ASAP: Architecture Search, Anneal and Prune
Asaf Noy
Niv Nayman
T. Ridnik
Nadav Zamir
Sivan Doveh
Itamar Friedman
Raja Giryes
Lihi Zelnik-Manor
33
102
0
08 Apr 2019
Video Classification with Channel-Separated Convolutional Networks
Du Tran
Heng Wang
Lorenzo Torresani
Matt Feiszli
3DV
22
581
0
04 Apr 2019
Few-shot brain segmentation from weakly labeled data with deep heteroscedastic multi-task networks
Richard McKinley
Michael Rebsamen
Raphael Meier
M. Reyes
C. Rummel
Roland Wiest
16
13
0
04 Apr 2019
Exploring Randomly Wired Neural Networks for Image Recognition
Saining Xie
Alexander Kirillov
Ross B. Girshick
Kaiming He
22
364
0
02 Apr 2019
Meta-learning Convolutional Neural Architectures for Multi-target Concrete Defect Classification with the COncrete DEfect BRidge IMage Dataset
Martin Mundt
Sagnik Majumder
S. Murali
P. Panetsos
Visvanathan Ramesh
14
97
0
02 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM
FaML
23
3,132
0
01 Apr 2019
Adversarial Robustness vs Model Compression, or Both?
Shaokai Ye
Kaidi Xu
Sijia Liu
Jan-Henrik Lambrechts
Huan Zhang
Aojun Zhou
Kaisheng Ma
Yanzhi Wang
Xue Lin
AAML
25
163
0
29 Mar 2019
Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition
Shiliang Zhang
Ming Lei
Zhijie Yan
22
15
0
27 Mar 2019
Deep Demosaicing for Edge Implementation
R. Ramakrishnan
Shangling Jui
V. Nia
36
5
0
26 Mar 2019
Improving image classifiers for small datasets by learning rate adaptations
Sourav Mishra
T. Yamasaki
Hideaki Imaizumi
14
11
0
26 Mar 2019
AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search
Linnan Wang
Yiyang Zhao
Yuu Jinnai
Yuandong Tian
Rodrigo Fonseca
BDL
25
95
0
26 Mar 2019
Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
Hao Fu
Chunyuan Li
Xiaodong Liu
Jianfeng Gao
Asli Celikyilmaz
Lawrence Carin
ODL
27
361
0
25 Mar 2019
sharpDARTS: Faster and More Accurate Differentiable Architecture Search
Andrew Hundt
Varun Jain
Gregory Hager
OOD
30
66
0
23 Mar 2019
Pre-trained Language Model Representations for Language Generation
Sergey Edunov
Alexei Baevski
Michael Auli
27
129
0
22 Mar 2019
Gradient-only line searches: An Alternative to Probabilistic Line Searches
D. Kafka
D. Wilke
ODL
43
14
0
22 Mar 2019
In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images
Marin Orsic
Ivan Kreso
Petra Bevandić
Sinisa Segvic
SSeg
22
339
0
20 Mar 2019
Convolution with even-sized kernels and symmetric padding
Shuang Wu
Guanrui Wang
Pei Tang
F. Chen
Luping Shi
22
68
0
20 Mar 2019
Cloze-driven Pretraining of Self-attention Networks
Alexei Baevski
Sergey Edunov
Yinhan Liu
Luke Zettlemoyer
Michael Auli
10
198
0
19 Mar 2019
A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction
Fan Zhou
Guojing Cong
19
8
0
12 Mar 2019
Interpolation Consistency Training for Semi-Supervised Learning
Vikas Verma
Kenji Kawaguchi
Alex Lamb
Juho Kannala
Arno Solin
Yoshua Bengio
David Lopez-Paz
39
757
0
09 Mar 2019
Inductive Transfer for Neural Architecture Optimization
Martin Wistuba
Tejaswini Pedapati
12
9
0
08 Mar 2019
Alternating Synthetic and Real Gradients for Neural Language Modeling
Fangxin Shang
Hao Zhang
21
1
0
27 Feb 2019
Modulated binary cliquenet
Jinpeng Xia
Jiasong Wu
Youyong Kong
Pinzheng Zhang
L. Senhadji
H. Shu
MQ
16
0
0
27 Feb 2019
NAS-Bench-101: Towards Reproducible Neural Architecture Search
Chris Ying
Aaron Klein
Esteban Real
Eric Christiansen
Kevin Patrick Murphy
Frank Hutter
12
673
0
25 Feb 2019
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
31
782
0
21 Feb 2019
Deep Learning Based Video System for Accurate and Real-Time Parking Measurement
B. Cai
Ricardo Alvarez
M. Sit
Fábio Duarte
C. Ratti
HAI
32
53
0
20 Feb 2019
Channel Max Pooling Layer for Fine-Grained Vehicle Classification
Zhanyu Ma
Dongliang Chang
Xiaoxu Li
22
4
0
14 Feb 2019
Bag of Freebies for Training Object Detection Neural Networks
Zhi-Li Zhang
Tong He
Hang Zhang
Zhongyue Zhang
Junyuan Xie
Mu Li
VLM
ObjD
22
188
0
11 Feb 2019
Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
Ruqi Zhang
Chunyuan Li
Jianyi Zhang
Changyou Chen
A. Wilson
BDL
30
273
0
11 Feb 2019
Contextual Recurrent Neural Networks
Sam Wenke
J. Fleming
16
5
0
09 Feb 2019
Combining learning rate decay and weight decay with complexity gradient descent - Part I
Pierre Harvey Richemond
Yike Guo
30
4
0
07 Feb 2019
Artificial Intelligence for Prosthetics - challenge solutions
L. Kidzinski
Carmichael F. Ong
Sharada Mohanty
Jennifer Hicks
Sean F. Carroll
...
E. Tumer
J. Watson
M. Salathé
Sergey Levine
Scott L. Delp
15
40
0
07 Feb 2019
Re-examination of the Role of Latent Variables in Sequence Modeling
Zihang Dai
Guokun Lai
Yiming Yang
Shinjae Yoo
BDL
DRL
27
4
0
04 Feb 2019
Network Parameter Learning Using Nonlinear Transforms, Local Representation Goals and Local Propagation Constraints
Dimche Kostadinov
Behrooz Razdehi
Slava Voloshynovskiy
31
0
0
31 Jan 2019