1907.08610
Cited By
v1
v2 (latest)
Lookahead Optimizer: k steps forward, 1 step back
19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
Papers citing
"Lookahead Optimizer: k steps forward, 1 step back"
50 / 357 papers shown
Title
Towards Principled Task Grouping for Multi-Task Learning
Chenguang Wang
Xuanhao Pan
Tianshu Yu
189
1
0
23 Feb 2024
Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
Hao Li
Mengqi Huang
Lei Zhang
Bo Hu
Yi Liu
Zhendong Mao
DiffM
99
2
0
22 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
66
6
0
14 Feb 2024
Multi-Scale Semantic Segmentation with Modified MBConv Blocks
Xi Chen
Yang Cai
Yuan Wu
Bo Xiong
Taesung Park
SSeg
76
0
0
07 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
204
164
0
03 Feb 2024
A Note On Lookahead In Real Life And Computing
Burle Sharma
Rakesh Mohanty
Sucheta Panda
41
0
0
02 Feb 2024
Making Parametric Anomaly Detection on Tabular Data Non-Parametric Again
Hugo Thimonier
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
94
2
0
30 Jan 2024
Finetuning Foundation Models for Joint Analysis Optimization
M. Vigl
N. Hartman
L. Heinrich
96
14
0
24 Jan 2024
Enhancing Digital Hologram Reconstruction Using Reverse-Attention Loss for Untrained Physics-Driven Deep Learning Models with Uncertain Distance
Xiwen Chen
Hao Wang
Zhao Zhang
Zhenmin Li
Huayu Li
Tong Ye
Abolfazl Razi
54
1
0
11 Jan 2024
Brain Tumor Segmentation Based on Deep Learning, Attention Mechanisms, and Energy-Based Uncertainty Prediction
Zachary Schwehr
Sriman Achanta
79
3
0
31 Dec 2023
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou
Tianyu Pang
Keqin Liu
Charles H. Martin
Michael W. Mahoney
Yaoqing Yang
143
12
0
01 Dec 2023
Locally Optimal Descent for Dynamic Stepsize Scheduling
Gilad Yehudai
Alon Cohen
Amit Daniely
Yoel Drori
Tomer Koren
Mariano Schain
93
0
0
23 Nov 2023
A Coefficient Makes SVRG Effective
Yida Yin
Zhiqiu Xu
Zhiyuan Li
Trevor Darrell
Zhuang Liu
91
1
0
09 Nov 2023
Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization
Liang Zhang
Junchi Yang
Amin Karbasi
Niao He
88
2
0
26 Oct 2023
Implicit meta-learning may lead language models to trust more reliable sources
Dmitrii Krasheninnikov
Egor Krasheninnikov
Bruno Mlodozeniec
Tegan Maharaj
David M. Krueger
86
4
0
23 Oct 2023
Stable Nonconvex-Nonconcave Training via Linear Interpolation
Thomas Pethick
Wanyun Xie
Volkan Cevher
66
6
0
20 Oct 2023
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei Chen
Khaled B. Letaief
FedML
105
13
0
16 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
117
62
0
27 Sep 2023
Exploring Flat Minima for Domain Generalization with Large Learning Rates
Jian Zhang
Lei Qi
Yinghuan Shi
Yang Gao
81
3
0
12 Sep 2023
Estimating exercise-induced fatigue from thermal facial images
Manuel Lage Cañellas
Constantino Álvarez Casado
L. Nguyen
Miguel Bordallo López
CVBM
48
0
0
12 Sep 2023
Stabilizing RNN Gradients through Pre-training
Luca Herranz-Celotti
Jean Rouat
114
1
0
23 Aug 2023
Deepbet: Fast brain extraction of T1-weighted MRI using Convolutional Neural Networks
L. Fisch
Stefan Zumdick
Carlotta B. C. Barkhau
D. Emden
J. Ernsting
...
K. Sarink
N. Winter
Benjamin Risse
U. Dannlowski
Tim Hahn
59
6
0
14 Aug 2023
MomentaMorph: Unsupervised Spatial-Temporal Registration with Momenta, Shooting, and Correction
Zhangxing Bian
Shuwen Wei
Yihao Liu
Junyu Chen
J. Zhuo
Fangxu Xing
Jonghye Woo
A. Carass
Jerry L. Prince
MedIm
57
2
0
05 Aug 2023
Multimodal Indoor Localisation in Parkinson's Disease for Detecting Medication Use: Observational Pilot Study in a Free-Living Setting
Ferdian Jovan
Catherine Morgan
Ryan McConville
E. Tonkin
I. Craddock
Alan Whone
32
3
0
03 Aug 2023
MFIM: Megapixel Facial Identity Manipulation
Sanghyeon Na
PICV
CVBM
62
4
0
03 Aug 2023
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
90
1
0
31 Jul 2023
LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
Tiezhu Sun
Weiguo Pian
N. Daoudi
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
128
1
0
30 Jul 2023
StylePrompter: All Styles Need Is Attention
Chenyi Zhuang
Pan Gao
A. Smolic
77
1
0
30 Jul 2023
Cross-dimensional transfer learning in medical image segmentation with deep learning
Hicham Messaoudi
Ahror Belaid
Douraied BEN SALEM
Pierre-Henri Conze
MedIm
91
27
0
29 Jul 2023
TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation
Huijie Zhang
Anthony Opipari
Xiaotong Chen
Jiyue Zhu
Zeren Yu
Odest Chadwicke Jenkins
51
1
0
23 Jul 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Gonçalo Mordido
A. Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Simon Lacoste-Julien
Razvan Pascanu
Sarath Chandar
ODL
50
1
0
18 Jul 2023
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification
Simon Holdenried-Krafft
Peter Somers
Ivonne A. Montes-Majarro
Diana Silimon
Cristina Tarín
F. Fend
Hendrik P. A. Lensch
MedIm
102
3
0
14 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
112
45
0
12 Jul 2023
The Whole Pathological Slide Classification via Weakly Supervised Learning
Qiehe Sun
Jiawen Li
Jin Xu
Junru Cheng
Tian Guan
Yonghong He
59
0
0
12 Jul 2023
Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in Multi-Objective Neural Architecture Search
Simone Sarti
Eugenio Lomurno
Matteo Matteucci
58
1
0
03 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
70
4
0
02 Jul 2023
Structured State Space Models for Multiple Instance Learning in Digital Pathology
Leo Fillioux
Joseph Boyd
Maria Vakalopoulou
P. Cournède
Stergios Christodoulidis
55
24
0
27 Jun 2023
Efficient ResNets: Residual Network Design
Aditya Thakur
Harish Chauhan
Nikunj Gupta
35
0
0
21 Jun 2023
Partial Hypernetworks for Continual Learning
Hamed Hemati
Vincenzo Lomonaco
D. Bacciu
Damian Borth
CLL
78
7
0
19 Jun 2023
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
58
2
0
19 Jun 2023
Lookaround Optimizer: $k$ steps around, 1 step average
Jiangtao Zhang
Shunyu Liu
Mingli Song
Tongtian Zhu
Zhenxing Xu
Mingli Song
MoMe
113
6
0
13 Jun 2023
Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data
Nikolay Patakin
Mikhail Romanov
Anna Vorontsova
M. Artemyev
Anton Konushin
MDE
86
6
0
05 Jun 2023
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
68
0
0
25 May 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Qihuang Zhong
Liang Ding
Juhua Liu
Xuebo Liu
Min Zhang
Bo Du
Dacheng Tao
VLM
75
10
0
24 May 2023
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Hugo Thimonier
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
84
6
0
24 May 2023
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Georg Wolflein
Lucie Charlotte Magister
Pietro Lio
David J. Harrison
Ognjen Arandjelovic
63
3
0
17 May 2023
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular Level
V. Sydorskyi
Igor Krashenyi
Denis Savka
Oleksandr Zarichkovyi
41
1
0
03 May 2023
SketchXAI: A First Look at Explainability for Human Sketches
Zhiyu Qu
Yulia Gryaditskaya
Ke Li
Kaiyue Pang
Tao Xiang
Yi-Zhe Song
89
8
0
23 Apr 2023
Hierarchical Weight Averaging for Deep Neural Networks
Xiaozhe Gu
Zixun Zhang
Yuncheng Jiang
Yaoyu Zhang
Ruimao Zhang
Shuguang Cui
Zhuguo Li
62
5
0
23 Apr 2023
Neuromorphic computing for attitude estimation onboard quadrotors
S. Stroobants
Julien Dupeyroux
Guido C. H. E de Croon
66
4
0
18 Apr 2023