ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1907.08610
  4. Cited By
Lookahead Optimizer: k steps forward, 1 step back

Lookahead Optimizer: k steps forward, 1 step back

19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
    ODL
ArXivPDFHTML

Papers citing "Lookahead Optimizer: k steps forward, 1 step back"

50 / 347 papers shown
Title
Locally Optimal Descent for Dynamic Stepsize Scheduling
Locally Optimal Descent for Dynamic Stepsize Scheduling
Gilad Yehudai
Alon Cohen
Amit Daniely
Yoel Drori
Tomer Koren
Mariano Schain
37
0
0
23 Nov 2023
A Coefficient Makes SVRG Effective
A Coefficient Makes SVRG Effective
Yida Yin
Zhiqiu Xu
Zhiyuan Li
Trevor Darrell
Zhuang Liu
33
1
0
09 Nov 2023
Optimal Guarantees for Algorithmic Reproducibility and Gradient
  Complexity in Convex Optimization
Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization
Liang Zhang
Junchi Yang
Amin Karbasi
Niao He
31
2
0
26 Oct 2023
Implicit meta-learning may lead language models to trust more reliable
  sources
Implicit meta-learning may lead language models to trust more reliable sources
Dmitrii Krasheninnikov
Egor Krasheninnikov
Bruno Mlodozeniec
Tegan Maharaj
David M. Krueger
26
3
0
23 Oct 2023
Stable Nonconvex-Nonconcave Training via Linear Interpolation
Stable Nonconvex-Nonconcave Training via Linear Interpolation
Thomas Pethick
Wanyun Xie
V. Cevher
32
5
0
20 Oct 2023
Over-the-Air Federated Learning and Optimization
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei Chen
Khaled B. Letaief
FedML
23
11
0
16 Oct 2023
Deep Model Fusion: A Survey
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
33
52
0
27 Sep 2023
Exploring Flat Minima for Domain Generalization with Large Learning
  Rates
Exploring Flat Minima for Domain Generalization with Large Learning Rates
Jian Zhang
Lei Qi
Yinghuan Shi
Yang Gao
41
2
0
12 Sep 2023
Estimating exercise-induced fatigue from thermal facial images
Estimating exercise-induced fatigue from thermal facial images
Manuel Lage Cañellas
Constantino Álvarez Casado
L. Nguyen
Miguel Bordallo López
CVBM
16
0
0
12 Sep 2023
Stabilizing RNN Gradients through Pre-training
Stabilizing RNN Gradients through Pre-training
Luca Herranz-Celotti
Jean Rouat
24
0
0
23 Aug 2023
Deepbet: Fast brain extraction of T1-weighted MRI using Convolutional
  Neural Networks
Deepbet: Fast brain extraction of T1-weighted MRI using Convolutional Neural Networks
L. Fisch
Stefan Zumdick
Carlotta B. C. Barkhau
D. Emden
J. Ernsting
...
K. Sarink
N. Winter
Benjamin Risse
U. Dannlowski
Tim Hahn
21
4
0
14 Aug 2023
MomentaMorph: Unsupervised Spatial-Temporal Registration with Momenta,
  Shooting, and Correction
MomentaMorph: Unsupervised Spatial-Temporal Registration with Momenta, Shooting, and Correction
Zhangxing Bian
Shuwen Wei
Yihao Liu
Junyu Chen
J. Zhuo
Fangxu Xing
Jonghye Woo
A. Carass
Jerry L. Prince
MedIm
24
2
0
05 Aug 2023
Multimodal Indoor Localisation in Parkinson's Disease for Detecting
  Medication Use: Observational Pilot Study in a Free-Living Setting
Multimodal Indoor Localisation in Parkinson's Disease for Detecting Medication Use: Observational Pilot Study in a Free-Living Setting
Ferdian Jovan
Catherine Morgan
Ryan McConville
E. Tonkin
I. Craddock
Alan Whone
25
2
0
03 Aug 2023
MFIM: Megapixel Facial Identity Manipulation
MFIM: Megapixel Facial Identity Manipulation
Sanghyeon Na
PICV
CVBM
38
4
0
03 Aug 2023
Lookbehind-SAM: k steps back, 1 step forward
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
45
1
0
31 Jul 2023
LaFiCMIL: Rethinking Large File Classification from the Perspective of
  Correlated Multiple Instance Learning
LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
Tiezhu Sun
Weiguo Pian
N. Daoudi
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
31
1
0
30 Jul 2023
StylePrompter: All Styles Need Is Attention
StylePrompter: All Styles Need Is Attention
Chenyi Zhuang
Pan Gao
A. Smolic
38
1
0
30 Jul 2023
Cross-dimensional transfer learning in medical image segmentation with
  deep learning
Cross-dimensional transfer learning in medical image segmentation with deep learning
Hicham Messaoudi
Ahror Belaid
Douraied BEN SALEM
Pierre-Henri Conze
MedIm
30
23
0
29 Jul 2023
TransNet: Transparent Object Manipulation Through Category-Level Pose
  Estimation
TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation
Huijie Zhang
Anthony Opipari
Xiaotong Chen
Jiyue Zhu
Zeren Yu
Odest Chadwicke Jenkins
19
1
0
23 Jul 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Gonçalo Mordido
A. Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Simon Lacoste-Julien
Razvan Pascanu
Sarath Chandar
ODL
30
1
0
18 Jul 2023
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based
  Tumor Classification
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification
Simon Holdenried-Krafft
Peter Somers
Ivonne A. Montes-Majarro
Diana Silimon
Cristina Tarín
F. Fend
Hendrik P. A. Lensch
MedIm
33
3
0
14 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For
  Transformer-based Language Models
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
22
41
0
12 Jul 2023
The Whole Pathological Slide Classification via Weakly Supervised
  Learning
The Whole Pathological Slide Classification via Weakly Supervised Learning
Qiehe Sun
Jiawen Li
Jin Xu
Junru Cheng
Tian Guan
Yonghong He
37
0
0
12 Jul 2023
Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in
  Multi-Objective Neural Architecture Search
Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in Multi-Objective Neural Architecture Search
Simone Sarti
Eugenio Lomurno
Matteo Matteucci
19
0
0
03 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to
  Adaptive and Non-adaptive Momentum Optimizers
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
33
4
0
02 Jul 2023
Structured State Space Models for Multiple Instance Learning in Digital
  Pathology
Structured State Space Models for Multiple Instance Learning in Digital Pathology
Leo Fillioux
Joseph Boyd
Maria Vakalopoulou
P. Cournède
Stergios Christodoulidis
8
21
0
27 Jun 2023
Efficient ResNets: Residual Network Design
Efficient ResNets: Residual Network Design
Aditya Thakur
Harish Chauhan
Nikunj Gupta
21
0
0
21 Jun 2023
Partial Hypernetworks for Continual Learning
Partial Hypernetworks for Continual Learning
Hamed Hemati
Vincenzo Lomonaco
D. Bacciu
Damian Borth
CLL
24
7
0
19 Jun 2023
Algorithms of Sampling-Frequency-Independent Layers for Non-integer
  Strides
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
23
1
0
19 Jun 2023
Lookaround Optimizer: $k$ steps around, 1 step average
Lookaround Optimizer: kkk steps around, 1 step average
Jiangtao Zhang
Shunyu Liu
Mingli Song
Tongtian Zhu
Zhenxing Xu
Mingli Song
MoMe
34
6
0
13 Jun 2023
Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on
  Dataset Mixtures with Uncalibrated Stereo Data
Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data
Nikolay Patakin
Mikhail Romanov
Anna Vorontsova
M. Artemyev
Anton Konushin
MDE
46
6
0
05 Jun 2023
SING: A Plug-and-Play DNN Learning Technique
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
33
0
0
25 May 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Qihuang Zhong
Liang Ding
Juhua Liu
Xuebo Liu
Min Zhang
Bo Du
Dacheng Tao
VLM
34
9
0
24 May 2023
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Hugo Thimonier
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
20
5
0
24 May 2023
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Georg Wolflein
Lucie Charlotte Magister
Pietro Lio
David J. Harrison
Ognjen Arandjelovic
24
2
0
17 May 2023
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular
  Level
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular Level
V. Sydorskyi
Igor Krashenyi
Denis Savka
Oleksandr Zarichkovyi
18
1
0
03 May 2023
SketchXAI: A First Look at Explainability for Human Sketches
SketchXAI: A First Look at Explainability for Human Sketches
Zhiyu Qu
Yulia Gryaditskaya
Ke Li
Kaiyue Pang
Tao Xiang
Yi-Zhe Song
28
8
0
23 Apr 2023
Hierarchical Weight Averaging for Deep Neural Networks
Hierarchical Weight Averaging for Deep Neural Networks
Xiaozhe Gu
Zixun Zhang
Yuncheng Jiang
Tao Luo
Ruimao Zhang
Shuguang Cui
Zhuguo Li
27
5
0
23 Apr 2023
Neuromorphic computing for attitude estimation onboard quadrotors
Neuromorphic computing for attitude estimation onboard quadrotors
S. Stroobants
Julien Dupeyroux
Guido C. H. E de Croon
35
4
0
18 Apr 2023
VISION DIFFMASK: Faithful Interpretation of Vision Transformers with
  Differentiable Patch Masking
VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking
A. Nalmpantis
Apostolos Panagiotopoulos
John Gkountouras
Konstantinos Papakostas
Wilker Aziz
15
4
0
13 Apr 2023
$β$-Variational autoencoders and transformers for reduced-order
  modelling of fluid flows
βββ-Variational autoencoders and transformers for reduced-order modelling of fluid flows
Alberto Solera-Rico
Carlos Sanmiguel Vila
Miguel Gómez-López
Yuning Wang
Abdulrahman Almashjary
Scott T. M. Dawson
Ricardo Vinuesa
DRL
16
74
0
07 Apr 2023
Astroformer: More Data Might not be all you need for Classification
Astroformer: More Data Might not be all you need for Classification
Rishit Dagli
30
7
0
03 Apr 2023
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose
  Estimation
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation
Linfang Zheng
Chen Wang
Ying Sun
Esha Dasgupta
Hua Chen
A. Leonardis
Wei Zhang
H. Chang
3DPC
42
41
0
28 Mar 2023
TriPlaneNet: An Encoder for EG3D Inversion
TriPlaneNet: An Encoder for EG3D Inversion
A. Bhattarai
Matthias Nießner
Artem Sevastopolsky
43
34
0
23 Mar 2023
Make Encoder Great Again in 3D GAN Inversion through Geometry and
  Occlusion-Aware Encoding
Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding
Ziyang Yuan
Yiming Zhu
Yu Li
Hongyu Liu
Chun Yuan
3DV
32
37
0
22 Mar 2023
Picture that Sketch: Photorealistic Image Generation from Abstract
  Sketches
Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
Subhadeep Koley
A. Bhunia
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
3DH
21
31
0
20 Mar 2023
Judging Adam: Studying the Performance of Optimization Methods on ML4SE
  Tasks
Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks
D. Pasechnyuk
Anton Prazdnichnykh
Mikhail Evtikhiev
T. Bryksin
32
1
0
06 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative
  Language Model
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
34
7
0
06 Mar 2023
Dropout Reduces Underfitting
Dropout Reduces Underfitting
Zhuang Liu
Zhi-Qin John Xu
Joseph Jin
Zhiqiang Shen
Trevor Darrell
40
36
0
02 Mar 2023
BEL: A Bag Embedding Loss for Transformer enhances Multiple Instance
  Whole Slide Image Classification
BEL: A Bag Embedding Loss for Transformer enhances Multiple Instance Whole Slide Image Classification
Daniel Sens
Ario Sadafi
F. P. Casale
Nassir Navab
Carsten Marr
ViT
MedIm
19
1
0
02 Mar 2023
Previous
1234567
Next