1907.08610
Cited By
v1
v2 (latest)
Lookahead Optimizer: k steps forward, 1 step back
19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
Papers citing
"Lookahead Optimizer: k steps forward, 1 step back"
50 / 357 papers shown
Title
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
Youngjun Song
Youngsik Hwang
Jonghun Lee
Heechang Lee
Dong-Young Lim
AAML
124
0
0
01 Jul 2025
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
97
4
0
01 Jul 2025
Quantum-Inspired Differentiable Integral Neural Networks (QIDINNs): A Feynman-Based Architecture for Continuous Learning Over Streaming Data
Oscar Boullosa Dapena
24
0
0
13 Jun 2025
NoLoCo: No-all-reduce Low Communication Training Method for Large Models
Jari Kolehmainen
Nikolay Blagoev
John Donaghy
Oğuzhan Ersoy
Christopher Nies
108
0
0
12 Jun 2025
Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
64
0
0
04 Jun 2025
Investigating Mask-aware Prototype Learning for Tabular Anomaly Detection
Ruiying Lu
Jinhan Liu
Chuan Du
D. Guo
OOD
AAML
61
0
0
03 Jun 2025
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization
C. Tan
Yubo Zhou
Haishan Ye
Guang Dai
Junmin Liu
Zengjie Song
Jiangshe Zhang
Zixiang Zhao
Yunda Hao
Yong Xu
AAML
40
0
0
29 May 2025
Gradient Flow Matching for Learning Update Dynamics in Neural Network Training
Xiao Shou
Yanna Ding
Jianxi Gao
24
0
0
26 May 2025
Accelerating Learned Image Compression Through Modeling Neural Training Dynamics
Yichi Zhang
Zhihao Duan
Yuning Huang
Fengqing Zhu
237
0
0
23 May 2025
A Unified Gradient-based Framework for Task-agnostic Continual Learning-Unlearning
Zhehao Huang
Xinwen Cheng
Jing Zhang
Jinghao Zheng
Haoran Wang
Zhengbao He
Tao Li
Xiaolin Huang
CLL
85
1
0
21 May 2025
Enhancing Certified Robustness via Block Reflector Orthogonal Layers and Logit Annealing Loss
Bo-Han Lai
Pin-Han Huang
Bo-Han Kung
Shang-Tse Chen
70
0
0
21 May 2025
Learning by solving differential equations
Benoit Dherin
Michael Munn
Hanna Mazzawi
Michael Wunder
Sourabh Medapati
Javier Gonzalvo
60
0
0
19 May 2025
A novel Neural-ODE model for the state of health estimation of lithium-ion battery using charging curve
Yiming Li
Man He
Jing Liu
53
0
0
09 May 2025
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary B. Charles
Gabriel Teston
Lucio Dery
Keith Rush
Nova Fallen
Zachary Garrett
Arthur Szlam
Arthur Douillard
459
6
0
12 Mar 2025
Hierarchical Semantic Compression for Consistent Image Semantic Restoration
Shengxi Li
Zifu Zhang
Mai Xu
Lai Jiang
Yufan Liu
Ce Zhu
DiffM
76
0
0
24 Feb 2025
Carefully Blending Adversarial Training, Purification, and Aggregation Improves Adversarial Robustness
Emanuele Ballarin
A. Ansuini
Luca Bortolussi
AAML
184
0
0
20 Feb 2025
ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning
Hrishikesh Gupta
S. Thalhammer
Jean-Baptiste Weibel
Alexander Haberl
Markus Vincze
106
0
0
31 Dec 2024
A Unified Analysis of Federated Learning with Arbitrary Client Participation
Shiqiang Wang
Mingyue Ji
FedML
148
59
0
31 Dec 2024
Transformer-based toxin-protein interaction analysis prioritizes airborne particulate matter components with potential adverse health effects
Yan Zhu
Shihao Wang
Yong Han
Yao Lu
Shulan Qiu
Ling Jin
Xiangdong Li
Weixiong Zhang
115
1
0
21 Dec 2024
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
141
0
0
26 Nov 2024
Retinal Vessel Segmentation via Neuron Programming
Tingting Wu
Ruyi Min
Peixuan Song
Hengtao Guo
Tieyong Zeng
Feng-Lei Fan
81
0
0
17 Nov 2024
ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks
Ziji Shi
Jialin Li
Yang You
56
1
0
06 Nov 2024
OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement
Qinglun Li
Miao Zhang
Mengzhu Wang
Quanjun Yin
Li Shen
OODD
FedML
64
0
0
09 Oct 2024
MECFormer: Multi-task Whole Slide Image Classification with Expert Consultation Network
Doanh C. Bui
Jin Tae Kwak
DiffM
MedIm
51
0
0
06 Oct 2024
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
110
6
0
04 Oct 2024
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen
Ahmet Berke Gokmen
Aysegül Dündar
84
2
0
30 Sep 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Zhehao Huang
Xinwen Cheng
Jinghao Zheng
Haoran Wang
Zhengbao He
Tao Li
Xiaolin Huang
MU
110
9
0
29 Sep 2024
DetectBERT: Towards Full App-Level Representation Learning to Detect Android Malware
Tiezhu Sun
N. Daoudi
Kisub Kim
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
64
3
0
29 Aug 2024
Decentralized Federated Learning with Model Caching on Mobile Agents
Xiaoyu Wang
Guojun Xiong
Houwei Cao
Jian Li
Yong Liu
FedML
94
1
0
26 Aug 2024
Rethinking Pre-Trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification
Bryan Wong
Mun Yong Yi
VLM
120
0
0
02 Aug 2024
Monocular pose estimation of articulated surgical instruments in open surgery
Robert Spektor
Tom Friedman
Itay Or
Gil Bolotin
S. Laufer
93
0
0
16 Jul 2024
Latent Space Imaging
Matheus Souza
Yidan Zheng
Kaizhang Kang
Yogeshwar Nath Mishra
Qiang Fu
Wolfgang Heidrich
143
0
0
09 Jul 2024
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu
Xiwen Chen
Peijie Qiu
Aristeidis Sotiras
Abolfazl Razi
Yalin Wang
82
7
0
04 Jul 2024
PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images
Parastoo Sotoudeh Sharifi
M. Omair Ahmad
M. N. S. Swamy
MoMe
OOD
96
0
0
21 Jun 2024
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee
Kazusato Oko
Taiji Suzuki
Denny Wu
MLT
149
25
0
03 Jun 2024
The Road Less Scheduled
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky
120
60
0
24 May 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
141
18
0
24 May 2024
Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging
Zhuchen Shao
M. Anastasio
Hua Li
DiffM
MedIm
59
1
0
10 May 2024
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Xiwen Chen
Peijie Qiu
Wenhui Zhu
Huayu Li
Hao Wang
Aristeidis Sotiras
Yalin Wang
Abolfazl Razi
AI4TS
68
9
0
06 May 2024
Image segmentation of treated and untreated tumor spheroids by Fully Convolutional Networks
Matthias Streller
S. Michlíková
Willy Ciecior
Katharina Lönnecke
L. Kunz-Schughart
Steffen Lange
Anja Voss-Böhme
123
1
0
02 May 2024
FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving
Ganesh Sistu
S. Yogamani
90
0
0
20 Apr 2024
RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications
Xingyu Liu
Chenyangguang Zhang
Gu Wang
Ruida Zhang
Xiangyang Ji
69
1
0
05 Apr 2024
94% on CIFAR-10 in 3.29 Seconds on a Single GPU
Keller Jordan
VLM
70
5
0
30 Mar 2024
Revisiting Random Weight Perturbation for Efficiently Improving Generalization
Tao Li
Qinghua Tao
Weihao Yan
Zehao Lei
Yingwen Wu
Kun Fang
Mingzhen He
Xiaolin Huang
AAML
114
6
0
30 Mar 2024
All-in-One: Heterogeneous Interaction Modeling for Cold-Start Rating Prediction
Shuheng Fang
Kangfei Zhao
Yu Rong
Zhixun Li
Jeffrey Xu Yu
101
0
0
26 Mar 2024
Predicting Perceived Gloss: Do Weak Labels Suffice?
Julia Guerrero-Viu
J. Daniel Subias
Ana Serrano
Katherine R. Storrs
Roland W. Fleming
B. Masiá
Diego F. F. Gutierrez
77
2
0
26 Mar 2024
FedMIL: Federated-Multiple Instance Learning for Video Analysis with Optimized DPP Scheduling
Ashish Bastola
Hao Wang
Xiwen Chen
Abolfazl Razi
68
1
0
26 Mar 2024
TexTile: A Differentiable Metric for Texture Tileability
Carlos Rodriguez-Pardo
Dan Casas
Elena Garces
Jorge López-Moreno
DiffM
83
4
0
19 Mar 2024
Biophysics Informed Pathological Regularisation for Brain Tumour Segmentation
Lipei Zhang
Yanqi Cheng
Lihao Liu
Carola-Bibiane Schönlieb
Angelica I Aviles-Rivero
AI4CE
89
10
0
14 Mar 2024
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Sayantan Choudhury
N. Tupitsa
Nicolas Loizou
Samuel Horváth
Martin Takáč
Eduard A. Gorbunov
133
1
0
05 Mar 2024