Lookahead Optimizer: k steps forward, 1 step back

19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
    ODL

Papers citing "Lookahead Optimizer: k steps forward, 1 step back"

50 / 357 papers shown
Title
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
Youngjun Song
Youngsik Hwang
Jonghun Lee
Heechang Lee
Dong-Young Lim
AAML
124
0
0
01 Jul 2025
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
97
4
0
01 Jul 2025
Quantum-Inspired Differentiable Integral Neural Networks (QIDINNs): A Feynman-Based Architecture for Continuous Learning Over Streaming Data
Oscar Boullosa Dapena
24
0
0
13 Jun 2025
NoLoCo: No-all-reduce Low Communication Training Method for Large Models
Jari Kolehmainen
Nikolay Blagoev
John Donaghy
Oğuzhan Ersoy
Christopher Nies
108
0
0
12 Jun 2025
Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
64
0
0
04 Jun 2025
Investigating Mask-aware Prototype Learning for Tabular Anomaly Detection
Ruiying Lu
Jinhan Liu
Chuan Du
D. Guo
OOD
AAML
61
0
0
03 Jun 2025
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization
C. Tan
Yubo Zhou
Haishan Ye
Guang Dai
Junmin Liu
Zengjie Song
Jiangshe Zhang
Zixiang Zhao
Yunda Hao
Yong Xu
AAML
40
0
0
29 May 2025
Gradient Flow Matching for Learning Update Dynamics in Neural Network Training
Xiao Shou
Yanna Ding
Jianxi Gao
24
0
0
26 May 2025
Accelerating Learned Image Compression Through Modeling Neural Training Dynamics
Yichi Zhang
Zhihao Duan
Yuning Huang
Fengqing Zhu
237
0
0
23 May 2025
A Unified Gradient-based Framework for Task-agnostic Continual Learning-Unlearning
Zhehao Huang
Xinwen Cheng
Jing Zhang
Jinghao Zheng
Haoran Wang
Zhengbao He
Tao Li
Xiaolin Huang
CLL
85
1
0
21 May 2025
Enhancing Certified Robustness via Block Reflector Orthogonal Layers and Logit Annealing Loss
Bo-Han Lai
Pin-Han Huang
Bo-Han Kung
Shang-Tse Chen
70
0
0
21 May 2025
Learning by solving differential equations
Benoit Dherin
Michael Munn
Hanna Mazzawi
Michael Wunder
Sourabh Medapati
Javier Gonzalvo
60
0
0
19 May 2025
A novel Neural-ODE model for the state of health estimation of lithium-ion battery using charging curve
Yiming Li
Man He
Jing Liu
53
0
0
09 May 2025
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary B. Charles
Gabriel Teston
Lucio Dery
Keith Rush
Nova Fallen
Zachary Garrett
Arthur Szlam
Arthur Douillard
459
6
0
12 Mar 2025
Hierarchical Semantic Compression for Consistent Image Semantic Restoration
Shengxi Li
Zifu Zhang
Mai Xu
Lai Jiang
Yufan Liu
Ce Zhu
DiffM
76
0
0
24 Feb 2025
Carefully Blending Adversarial Training, Purification, and Aggregation Improves Adversarial Robustness
Emanuele Ballarin
A. Ansuini
Luca Bortolussi
AAML
184
0
0
20 Feb 2025
ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning
Hrishikesh Gupta
S. Thalhammer
Jean-Baptiste Weibel
Alexander Haberl
Markus Vincze
106
0
0
31 Dec 2024
A Unified Analysis of Federated Learning with Arbitrary Client Participation
Shiqiang Wang
Mingyue Ji
FedML
148
59
0
31 Dec 2024
Transformer-based toxin-protein interaction analysis prioritizes airborne particulate matter components with potential adverse health effects
Yan Zhu
Shihao Wang
Yong Han
Yao Lu
Shulan Qiu
Ling Jin
Xiangdong Li
Weixiong Zhang
115
1
0
21 Dec 2024
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
141
0
0
26 Nov 2024
Retinal Vessel Segmentation via Neuron Programming
Tingting Wu
Ruyi Min
Peixuan Song
Hengtao Guo
Tieyong Zeng
Feng-Lei Fan
81
0
0
17 Nov 2024
ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks
Ziji Shi
Jialin Li
Yang You
56
1
0
06 Nov 2024
OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement
Qinglun Li
Miao Zhang
Mengzhu Wang
Quanjun Yin
Li Shen
OODD
FedML
64
0
0
09 Oct 2024
MECFormer: Multi-task Whole Slide Image Classification with Expert Consultation Network
Doanh C. Bui
Jin Tae Kwak
DiffM
MedIm
51
0
0
06 Oct 2024
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
110
6
0
04 Oct 2024
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen
Ahmet Berke Gokmen
Aysegül Dündar
84
2
0
30 Sep 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Zhehao Huang
Xinwen Cheng
Jinghao Zheng
Haoran Wang
Zhengbao He
Tao Li
Xiaolin Huang
MU
110
9
0
29 Sep 2024
DetectBERT: Towards Full App-Level Representation Learning to Detect Android Malware
Tiezhu Sun
N. Daoudi
Kisub Kim
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
64
3
0
29 Aug 2024
Decentralized Federated Learning with Model Caching on Mobile Agents
Xiaoyu Wang
Guojun Xiong
Houwei Cao
Jian Li
Yong Liu
FedML
94
1
0
26 Aug 2024
Rethinking Pre-Trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification
Bryan Wong
Mun Yong Yi
VLM
120
0
0
02 Aug 2024
Monocular pose estimation of articulated surgical instruments in open surgery
Robert Spektor
Tom Friedman
Itay Or
Gil Bolotin
S. Laufer
93
0
0
16 Jul 2024
Latent Space Imaging
Matheus Souza
Yidan Zheng
Kaizhang Kang
Yogeshwar Nath Mishra
Qiang Fu
Wolfgang Heidrich
143
0
0
09 Jul 2024
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu
Xiwen Chen
Peijie Qiu
Aristeidis Sotiras
Abolfazl Razi
Yalin Wang
82
7
0
04 Jul 2024
PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images
Parastoo Sotoudeh Sharifi
M. Omair Ahmad
M. N. S. Swamy
MoMe
OOD
96
0
0
21 Jun 2024
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee
Kazusato Oko
Taiji Suzuki
Denny Wu
MLT
149
25
0
03 Jun 2024
The Road Less Scheduled
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky
120
60
0
24 May 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
141
18
0
24 May 2024
Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging
Zhuchen Shao
M. Anastasio
Hua Li
DiffM
MedIm
59
1
0
10 May 2024
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Xiwen Chen
Peijie Qiu
Wenhui Zhu
Huayu Li
Hao Wang
Aristeidis Sotiras
Yalin Wang
Abolfazl Razi
AI4TS
68
9
0
06 May 2024
Image segmentation of treated and untreated tumor spheroids by Fully Convolutional Networks
Matthias Streller
S. Michlíková
Willy Ciecior
Katharina Lönnecke
L. Kunz-Schughart
Steffen Lange
Anja Voss-Böhme
123
1
0
02 May 2024
FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving
Ganesh Sistu
S. Yogamani
90
0
0
20 Apr 2024
RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications
Xingyu Liu
Chenyangguang Zhang
Gu Wang
Ruida Zhang
Xiangyang Ji
69
1
0
05 Apr 2024
94% on CIFAR-10 in 3.29 Seconds on a Single GPU
Keller Jordan
VLM
70
5
0
30 Mar 2024
Revisiting Random Weight Perturbation for Efficiently Improving Generalization
Tao Li
Qinghua Tao
Weihao Yan
Zehao Lei
Yingwen Wu
Kun Fang
Mingzhen He
Xiaolin Huang
AAML
114
6
0
30 Mar 2024
All-in-One: Heterogeneous Interaction Modeling for Cold-Start Rating Prediction
Shuheng Fang
Kangfei Zhao
Yu Rong
Zhixun Li
Jeffrey Xu Yu
101
0
0
26 Mar 2024
Predicting Perceived Gloss: Do Weak Labels Suffice?
Julia Guerrero-Viu
J. Daniel Subias
Ana Serrano
Katherine R. Storrs
Roland W. Fleming
B. Masiá
Diego F. F. Gutierrez
77
2
0
26 Mar 2024
FedMIL: Federated-Multiple Instance Learning for Video Analysis with Optimized DPP Scheduling
Ashish Bastola
Hao Wang
Xiwen Chen
Abolfazl Razi
68
1
0
26 Mar 2024
TexTile: A Differentiable Metric for Texture Tileability
Carlos Rodriguez-Pardo
Dan Casas
Elena Garces
Jorge López-Moreno
DiffM
83
4
0
19 Mar 2024
Biophysics Informed Pathological Regularisation for Brain Tumour Segmentation
Lipei Zhang
Yanqi Cheng
Lihao Liu
Carola-Bibiane Schönlieb
Angelica I Aviles-Rivero
AI4CE
89
10
0
14 Mar 2024
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Sayantan Choudhury
N. Tupitsa
Nicolas Loizou
Samuel Horváth
Martin Takáč
Eduard A. Gorbunov
133
1
0
05 Mar 2024