ResearchTrend.AI
Lookahead Optimizer: k steps forward, 1 step back

19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
    ODL

Papers citing "Lookahead Optimizer: k steps forward, 1 step back"

50 / 357 papers shown
Title
BUT-FIT at SemEval-2020 Task 5: Automatic detection of counterfactual statements with deep pre-trained language representation models
Martin Fajcik
Josef Jon
Martin Docekal
Pavel Smrz
42
11
0
28 Jul 2020
Neural networks with late-phase weights
J. Oswald
Seijin Kobayashi
Alexander Meulemans
Christian Henning
Benjamin Grewe
João Sacramento
94
35
0
25 Jul 2020
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
V. Kumar
S. Yogamani
Markus Bach
Christian Witt
Stefan Milz
Patrick Mäder
MDE
90
53
0
13 Jul 2020
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt
Frank Schneider
Philipp Hennig
ODL
219
169
0
03 Jul 2020
On the Outsized Importance of Learning Rates in Local Update Methods
Zachary B. Charles
Jakub Konečný
FedML
92
54
0
02 Jul 2020
Taming GANs with Lookahead-Minmax
Tatjana Chavdarova
Matteo Pagliardini
Sebastian U. Stich
François Fleuret
Martin Jaggi
GAN
61
27
0
25 Jun 2020
MRI Image Reconstruction via Learning Optimization Using Neural ODEs
Eric Z. Chen
Terrence Chen
Shanhui Sun
125
23
0
24 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
111
2
0
21 Jun 2020
Lookahead Adversarial Learning for Near Real-Time Semantic Segmentation
Hadi Jamali Rad
Attila Szabo
SSeg
76
1
0
19 Jun 2020
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights
Byeongho Heo
Sanghyuk Chun
Seong Joon Oh
Dongyoon Han
Sangdoo Yun
Gyuwan Kim
Youngjung Uh
Jung-Woo Ha
ODL
380
27
0
15 Jun 2020
Entropic gradient descent algorithms and wide flat minima
Fabrizio Pittorino
Carlo Lucibello
Christoph Feinauer
Gabriele Perugini
Carlo Baldassi
Elizaveta Demyanenko
R. Zecchina
ODL MLT
113
33
0
14 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL VLM
173
437
0
11 Jun 2020
sEMG Gesture Recognition with a Simple Model of Attention
David Josephs
Carson Drake
Andrew Heroy
John Santerre
64
48
0
05 Jun 2020
CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing
O. Borysenko
M. Byshkin
ODL
60
14
0
29 May 2020
Adaptive Transformers for Learning Multimodal Representations
Prajjwal Bhargava
21
4
0
15 May 2020
Neural Networks Versus Conventional Filters for Inertial-Sensor-based Attitude Estimation
Daniel Weber
C. Gühmann
Thomas Seel
37
35
0
14 May 2020
2kenize: Tying Subword Sequences for Chinese Script Conversion
Pranav A
Isabelle Augenstein
66
1
0
07 May 2020
BlackBox: Generalizable Reconstruction of Extremal Values from Incomplete Spatio-Temporal Data
T. Ivek
Domagoj Vlah
66
4
0
30 Apr 2020
How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking
Nicola De Cao
Michael Schlichtkrull
Wilker Aziz
Ivan Titov
76
92
0
30 Apr 2020
Multi-view Self-Constructing Graph Convolutional Networks with Adaptive Class Weighting Loss for Semantic Segmentation
Qinghui Liu
Michael C. Kampffmeyer
Robert Jenssen
Arnt-Børre Salberg
SSL
56
36
0
21 Apr 2020
An Adaptive Intelligence Algorithm for Undersampled Knee MRI Reconstruction
Nicola Pezzotti
Sahar Yousefi
Mohamed S. Elmahdy
J. V. Gemert
C. Schulke
...
Sergey Kastryulin
B. Lelieveldt
M. Osch
E. Weerdt
Marius Staring
70
100
0
15 Apr 2020
An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers
Bernhard Liebl
M. Burghardt
SSeg
31
11
0
15 Apr 2020
Self6D: Self-Supervised Monocular 6D Object Pose Estimation
Gu Wang
Fabian Manhardt
Jianzhun Shao
Xiangyang Ji
Nassir Navab
Federico Tombari
SSL MDE
100
137
0
14 Apr 2020
Detached Error Feedback for Distributed SGD with Random Sparsification
An Xu
Heng-Chiao Huang
71
9
0
11 Apr 2020
Applying Cyclical Learning Rate to Neural Machine Translation
Choon Meng Lee
Jianfeng Liu
Wei Peng
ODL
27
2
0
06 Apr 2020
Multi-Plateau Ensemble for Endoscopic Artefact Segmentation and Detection
Suyog Jadhav
Udbhav Bamba
Arnav Chavan
Rishabh Tiwari
A. Raj
38
3
0
23 Mar 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang
Yukun Zhu
Bradley Green
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
3DPC
136
676
0
17 Mar 2020
Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting
Pongpisit Thanasutives
Ken-ichi Fukui
M. Numao
B. Kijsirikul
151
65
0
12 Mar 2020
Flexible numerical optimization with ensmallen
Ryan R. Curtin
Marcus Edel
Rahul Prabhu
S. Basak
Zhihao Lou
Conrad Sanderson
76
1
0
09 Mar 2020
Train-by-Reconnect: Decoupling Locations of Weights from their Values
Yushi Qiu
R. Suda
20
0
0
05 Mar 2020
Colored Noise Injection for Training Adversarially Robust Neural Networks
Evgenii Zheltonozhskii
Chaim Baskin
Yaniv Nemcovsky
Brian Chmiel
A. Mendelson
A. Bronstein
AAML
32
5
0
04 Mar 2020
3D dynamic hand gestures recognition using the Leap Motion sensor and convolutional neural networks
Katia Lupinetti
A. Ranieri
F. Giannini
M. Monti
SLR
53
29
0
03 Mar 2020
A New Dataset, Poisson GAN and AquaNet for Underwater Object Grabbing
Chongwei Liu
Zhihui Wang
Shijie Wang
Tao Tang
Yulong Tao
Caifei Yang
Haojie Li
Xing Liu
Xin-Yue Fan
62
53
0
03 Mar 2020
Iterative Averaging in the Quest for Best Test Error
Diego Granziol
Xingchen Wan
Samuel Albanie
Stephen J. Roberts
74
3
0
02 Mar 2020
Adaptive Federated Optimization
Sashank J. Reddi
Zachary B. Charles
Manzil Zaheer
Zachary Garrett
Keith Rush
Jakub Konečný
Sanjiv Kumar
H. B. McMahan
FedML
223
1,461
0
29 Feb 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou
Sharan Vaswani
I. Laradji
Simon Lacoste-Julien
105
189
0
24 Feb 2020
From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
55
52
0
18 Feb 2020
Meta-learning Extractors for Music Source Separation
David Samuel
Aditya Ganeshan
Jason Naradowsky
93
62
0
17 Feb 2020
LaProp: Separating Momentum and Adaptivity in Adam
Liu Ziyin
Zhikang T. Wang
Masahito Ueda
ODL
70
18
0
12 Feb 2020
Evolutionary Neural Architecture Search for Retinal Vessel Segmentation
Zhun Fan
Jiahong Wei
Guijie Zhu
Jiajie Mo
Wenji Li
64
8
0
18 Jan 2020
Gradient descent with momentum --- to accelerate or to super-accelerate?
Goran Nakerst
John Brennan
M. Haque
ODL
31
15
0
17 Jan 2020
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
Andrés Mafla
S. Dey
Ali Furkan Biten
Lluís Gómez
Dimosthenis Karatzas
60
26
0
14 Jan 2020
CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity
Konpat Preechakul
B. Kijsirikul
ODL
38
3
0
24 Dec 2019
Pyramid Convolutional RNN for MRI Image Reconstruction
Eric Z. Chen
Puyang Wang
Xiao Chen
Terrence Chen
Shanhui Sun
86
44
0
02 Dec 2019
Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
Giannis Daras
Augustus Odena
Han Zhang
A. Dimakis
106
56
0
27 Nov 2019
Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning
Gang Chen
71
4
0
24 Nov 2019
Weakly Supervised Multi-Task Learning for Cell Detection and Segmentation
Alireza Chamanzar
Yao Nie
56
55
0
27 Oct 2019
Filterbank design for end-to-end speech separation
Manuel Pariente
Samuele Cornell
Antoine Deleforge
Emmanuel Vincent
113
69
0
23 Oct 2019
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Jianyu Wang
Vinayak Tantia
Nicolas Ballas
Michael G. Rabbat
99
201
0
01 Oct 2019
MGBPv2: Scaling Up Multi-Grid Back-Projection Networks
Pablo Navarrete Michelini
Wenbin Chen
Hanwen Liu
Dan Zhu
65
7
0
27 Sep 2019