Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark

18 February 2024
Yihua Zhang
Pingzhi Li
Junyuan Hong
Jiaxiang Li
Yimeng Zhang
Wenqing Zheng
Pin-Yu Chen
Jason D. Lee
Wotao Yin
Mingyi Hong
Zhangyang Wang
Sijia Liu
Tianlong Chen

Papers citing "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark"

42 / 42 papers shown
Stochastic Subspace Descent Accelerated via Bi-fidelity Line Search
Nuojin Cheng
Alireza Doostan
Stephen Becker
39
0
0
30 Apr 2025
Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
Qitao Tan
Sung-En Chang
Rui Xia
Huidong Ji
Chence Yang
...
Zheng Zhan
Zhou Zou
Y. Wang
Jin Lu
Geng Yuan
41
0
0
28 Apr 2025
PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization
Yang Jiao
X. Wang
Kai Yang
AAML
SILM
33
0
0
10 Apr 2025
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory
Liangyu Wang
Jie Ren
Hang Xu
Junxiao Wang
Huanyi Xie
David E. Keyes
Di Wang
60
0
0
16 Mar 2025
Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models
Alessio Galatolo
Zhenbang Dai
Katie Winkle
Meriem Beloucif
53
0
0
05 Mar 2025
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
Jiajun Zhou
Yifan Yang
Kai Zhen
Z. Liu
Yequan Zhao
Ershad Banijamali
Athanasios Mouchtaris
Ngai Wong
Zheng Zhang
MQ
41
0
0
17 Feb 2025
MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models
Zhen Zhang
Yuqing Yang
Kai Zhen
Nathan Susanj
Athanasios Mouchtaris
Siegfried Kunzmann
Zheng Zhang
54
0
0
17 Feb 2025
Scalable Back-Propagation-Free Training of Optical Physics-Informed Neural Networks
Yequan Zhao
Xinling Yu
Xian Xiao
Zhengzhang Chen
Z. Liu
G. Kurczveil
R. Beausoleil
S. Liu
Z. Zhang
56
0
0
17 Feb 2025
ElasticZO: A Memory-Efficient On-Device Learning with Combined Zeroth- and First-Order Optimization
Keisuke Sugiura
Hiroki Matsutani
MQ
36
1
0
08 Jan 2025
COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
Jinqi Xiao
S. Sang
Tiancheng Zhi
Jing Liu
Qing Yan
Linjie Luo
Bo Yuan
VLM
86
1
0
26 Nov 2024
Poor Man's Training on MCUs: A Memory-Efficient Quantized Back-Propagation-Free Approach
Yequan Zhao
Hai Li
Ian Young
Zheng Zhang
MQ
37
2
0
07 Nov 2024
On the Crucial Role of Initialization for Matrix Factorization
Bingcong Li
Liang Zhang
Aryan Mokhtari
Niao He
28
1
0
24 Oct 2024
Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models
Fei Wang
Li Shen
Liang Ding
Chao Xue
Ye Liu
Changxing Ding
32
0
0
13 Oct 2024
Zeroth-Order Fine-Tuning of LLMs in Random Subspaces
Ziming Yu
Pan Zhou
Sike Wang
Jia Li
Hua Huang
31
1
0
11 Oct 2024
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li
Xinwei Zhang
Peilin Zhong
Yuan Deng
Meisam Razaviyayn
Vahab Mirrokni
25
2
0
09 Oct 2024
FLOPS: Forward Learning with OPtimal Sampling
Tao Ren
Zishi Zhang
Jinyang Jiang
Guanghao Li
Zeliang Zhang
Mingqian Feng
Yijie Peng
35
1
0
08 Oct 2024
Unifying back-propagation and forward-forward algorithms through model predictive control
Lianhai Ren
Qianxiao Li
31
1
0
29 Sep 2024
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Qining Zhang
Lei Ying
OffRL
37
2
0
25 Sep 2024
A Historical Trajectory Assisted Optimization Method for Zeroth-Order Federated Learning
Chenlin Wu
Xiaoyu He
Zike Li
Zibin Zheng
FedML
24
0
0
24 Sep 2024
RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models
Haoyu Chen
Wenbo Li
Jinjin Gu
Jingjing Ren
Sixiang Chen
Tian-Chun Ye
Renjing Pei
Kaiwen Zhou
Fenglong Song
Lei Zhu
OffRL
44
10
0
25 Jul 2024
Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
Qianli Shen
Yezhen Wang
Zhouhao Yang
Xiang Li
Haonan Wang
Yang Zhang
Jonathan Scarlett
Zhanxing Zhu
Kenji Kawaguchi
AI4CE
69
4
0
20 Jun 2024
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity
Wentao Guo
Jikai Long
Yimeng Zeng
Zirui Liu
Xinyu Yang
...
Osbert Bastani
Christopher De Sa
Xiaodong Yu
Beidi Chen
Zhaozhuo Xu
31
14
0
05 Jun 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Haibo Yang
FedML
59
3
0
24 May 2024
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam
Youngsuk Park
Hao Zhou
Parameswaran Raman
Wooseok Ha
43
11
0
11 Apr 2024
The Power of Few: Accelerating and Enhancing Data Reweighting with Coreset Selection
Mohammad Jafari
Yimeng Zhang
Yihua Zhang
Sijia Liu
38
2
0
18 Mar 2024
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
66
8
0
23 Feb 2024
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning
Zhicheng Liu
Jian Lou
W. Bao
Yihan Hu
Baochun Li
Zhanyue Qin
K. Ren
29
7
0
12 Feb 2024
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang
Ashwinee Panda
Milad Nasr
Saeed Mahloujifar
Prateek Mittal
47
18
0
09 Jan 2024
DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training
Aochuan Chen
Yimeng Zhang
Jinghan Jia
James Diffenderfer
Jiancheng Liu
Konstantinos Parasyris
Yihua Zhang
Zheng Zhang
B. Kailkhura
Sijia Liu
30
43
0
03 Oct 2023
Scaling Forward Gradient With Local Losses
Mengye Ren
Simon Kornblith
Renjie Liao
Geoffrey E. Hinton
81
49
0
07 Oct 2022
Optimization without Backpropagation
Gabriel Belouze
26
7
0
13 Sep 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
146
638
0
26 May 2022
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Xiao Liu
Kaixuan Ji
Yicheng Fu
Weng Lam Tam
Zhengxiao Du
Zhilin Yang
Jie Tang
VLM
238
806
0
14 Oct 2021
MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators
Zhixing Tan
Xiangwen Zhang
Shuo Wang
Yang Liu
VLM
LRM
213
52
0
13 Oct 2021
Curvature-Aware Derivative-Free Optimization
Bumsu Kim
HanQin Cai
Daniel McKenzie
W. Yin
ODL
22
10
0
27 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,848
0
18 Apr 2021
Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm
Bin Gu
Guodong Liu
Yanfu Zhang
Xiang Geng
Heng Huang
31
18
0
17 Feb 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
177
414
0
18 Jan 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
241
1,919
0
31 Dec 2020
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
153
345
0
23 Jul 2020
Sign-OPT: A Query-Efficient Hard-label Adversarial Attack
Minhao Cheng
Simranjit Singh
Patrick H. Chen
Pin-Yu Chen
Sijia Liu
Cho-Jui Hsieh
AAML
124
219
0
24 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018