2310.08659
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
12 October 2023
Yixiao Li
Yifan Yu
Chen Liang
Pengcheng He
Nikos Karampatziakis
Weizhu Chen
Tuo Zhao
MQ
Papers citing
"LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models"
32 / 32 papers shown
Title
Communication-Efficient Hybrid Language Model via Uncertainty-Aware Opportunistic and Compressed Transmission
Seungeun Oh
Jinhyuk Kim
Jihong Park
Seung-Woo Ko
Jinho Choi
Tony Q. S. Quek
Seong-Lyun Kim
2
0
0
17 May 2025
Efficient Fine-Tuning of Quantized Models via Adaptive Rank and Bitwidth
Changhai Zhou
Yuhua Zhou
Qian Qiao
Weizhong Zhang
Cheng Jin
MQ
27
0
0
02 May 2025
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization
Bojana Ranković
P. Schwaller
BDL
175
0
0
08 Apr 2025
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Hao Chen
S. Hu
Wayne Luk
Timothy M. Hospedales
Hongxiang Fan
MoMe
72
0
0
16 Mar 2025
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou
Shuo Wang
Zhihang Yuan
Mingjia Shi
Yuzhang Shang
Dawei Yang
ALM
MQ
90
0
0
18 Feb 2025
An Efficient Row-Based Sparse Fine-Tuning
Cen-Jhih Li
Aditya Bhaskara
56
0
0
17 Feb 2025
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou
Qizheng Zhang
Hermann Kumbong
Kunle Olukotun
MQ
249
0
0
12 Feb 2025
A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models
Qinggang Zhang
Shengyuan Chen
Yuanchen Bei
Zheng Yuan
Huachi Zhou
Zijin Hong
Junnan Dong
Hao-Heng Chen
Yi-Ju Chang
Xiao Huang
3DV
70
7
0
21 Jan 2025
FlexQuant: Elastic Quantization Framework for Locally Hosted LLM on Edge Devices
Yuji Chai
Mujin Kwen
David Brooks
Gu-Yeon Wei
MQ
44
3
0
13 Jan 2025
Uncertainty-Aware Hybrid Inference with On-Device Small and Remote Large Language Models
Seungeun Oh
Jinhyuk Kim
Jihong Park
Seung-Woo Ko
Tony Q. S. Quek
Seong-Lyun Kim
77
5
0
17 Dec 2024
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
Shih-yang Liu
Huck Yang
Nai Chit Fung
Hongxu Yin
...
Jan Kautz
Yu-Chun Wang
Pavlo Molchanov
Min-Hung Chen
MQ
31
0
0
28 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
L. Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
49
3
0
24 Oct 2024
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando
Han Shen
Parikshit Ram
Yi Zhou
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
CLL
56
2
0
20 Oct 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Yonghong Tian
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Yu Qiao
MQ
46
24
0
10 Jul 2024
Composable Interventions for Language Models
Arinbjorn Kolbeinsson
Kyle O'Brien
Tianjin Huang
Shanghua Gao
Shiwei Liu
...
Anurag J. Vaidya
Faisal Mahmood
Marinka Zitnik
Tianlong Chen
Thomas Hartvigsen
KELM
MU
89
5
0
09 Jul 2024
Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE
Florence Regol
Joud Chataoui
Bertrand Charpentier
Mark J. Coates
Pablo Piantanida
Stephan Gunnemann
45
0
0
20 Jun 2024
The Impact of Initialization on LoRA Finetuning Dynamics
Soufiane Hayou
Nikhil Ghosh
Bin Yu
AI4CE
36
11
0
12 Jun 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
Yibo Yang
Xiaojie Li
Zhongzhu Zhou
Shuaiwen Leon Song
Jianlong Wu
Liqiang Nie
Guohao Li
45
6
0
07 Jun 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
MQ
46
0
0
31 May 2024
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
36
7
0
28 May 2024
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng
Zhaohui Wang
Muhan Zhang
VLM
64
73
0
03 Apr 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
52
79
0
26 Mar 2024
LoRA+: Efficient Low Rank Adaptation of Large Models
Soufiane Hayou
Nikhil Ghosh
Bin Yu
AI4CE
37
141
0
19 Feb 2024
Institutional Platform for Secure Self-Service Large Language Model Exploration
V. Bumgardner
Mitchell A. Klusty
W. V. Logan
Samuel E. Armstrong
Caylin D. Hickey
Jeff Talbert
56
1
0
01 Feb 2024
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning
Han Guo
P. Greengard
Eric P. Xing
Yoon Kim
MQ
36
43
0
20 Nov 2023
Audio Editing with Non-Rigid Text Prompts
Francesco Paissan
Luca Della Libera
Zhepei Wang
Mirco Ravanelli
Paris Smaragdis
Cem Subakan
DiffM
46
5
0
19 Oct 2023
Mobile Foundation Model as Firmware
Jinliang Yuan
Chenchen Yang
Dongqi Cai
Shihe Wang
Xin Yuan
...
Di Zhang
Hanzi Mei
Xianqing Jia
Shangguang Wang
Mengwei Xu
40
19
0
28 Aug 2023
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
79
47
0
30 Sep 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
142
221
0
31 Dec 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomás Kociský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
184
3,510
0
10 Jun 2015