Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2309.14717
Cited By
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
26 September 2023
Yuhui Xu
Lingxi Xie
Xiaotao Gu
Xin Chen
Heng Chang
Hengheng Zhang
Zhensu Chen
Xiaopeng Zhang
Qi Tian
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"
21 / 21 papers shown
Title
PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
Sunghyeon Woo
Sol Namkung
Sunwoo Lee
Inho Jeong
Beomseok Kim
Dongsuk Jeon
39
0
0
28 Feb 2025
GaLore
+
+
+
: Boosting Low-Rank Adaptation for LLMs with Cross-Head Projection
Xutao Liao
Shaohui Li
Yuhui Xu
Zhi Li
Y. Liu
You He
VLM
59
2
0
31 Dec 2024
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
Shih-yang Liu
Huck Yang
Nai Chit Fung
Nai Chit Fung
Hongxu Yin
...
Jan Kautz
Yu-Chun Wang
Pavlo Molchanov
Min-Hung Chen
Min-Hung Chen
MQ
31
0
0
28 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
L. Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
49
3
0
24 Oct 2024
Fine-tuning and Prompt Engineering with Cognitive Knowledge Graphs for Scholarly Knowledge Organization
Gollam Rabby
Sören Auer
Jennifer D'Souza
A. Oelen
127
2
0
10 Sep 2024
Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models
Zi Liang
Haibo Hu
Qingqing Ye
Yaxin Xiao
Haoyang Li
AAML
ELM
SILM
48
6
0
05 Aug 2024
ThinK: Thinner Key Cache by Query-Driven Pruning
Yuhui Xu
Zhanming Jie
Hanze Dong
Lei Wang
Xudong Lu
Aojun Zhou
Amrita Saha
Caiming Xiong
Doyen Sahoo
72
14
0
30 Jul 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Yu Qiao
MQ
46
24
0
10 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
LI DU
Guoqi Li
Jiajun Zhang
55
5
0
05 Jul 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
Yibo Yang
Xiaojie Li
Zhongzhu Zhou
Shuaiwen Leon Song
Jianlong Wu
Liqiang Nie
Guohao Li
45
6
0
07 Jun 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
Wu-Jun Li
MQ
46
0
0
31 May 2024
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
36
7
0
28 May 2024
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang
Xingyu Zheng
Xudong Ma
Haotong Qin
Chengtao Lv
Hong Chen
Jie Luo
Xiaojuan Qi
Xianglong Liu
Michele Magno
MQ
59
38
0
22 Apr 2024
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng
Zhaohui Wang
Muhan Zhang
VLM
64
73
0
03 Apr 2024
Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks
Tingyu Qu
Tinne Tuytelaars
Marie-Francine Moens
MoE
46
2
0
14 Mar 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo
Baeseong Park
Byeongwook Kim
Minjung Jo
S. Kwon
Dongsuk Jeon
Dongsoo Lee
65
2
0
27 Feb 2024
Zero-shot Generative Large Language Models for Systematic Review Screening Automation
Shuai Wang
Harrisen Scells
Shengyao Zhuang
Martin Potthast
Bevan Koopman
Guido Zuccon
30
12
0
12 Jan 2024
NOLA: Compressing LoRA using Linear Combination of Random Basis
Soroush Abbasi Koohpayegani
K. Navaneet
Parsa Nooralinejad
Soheil Kolouri
Hamed Pirsiavash
40
12
0
04 Oct 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
298
2,232
0
22 Mar 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
250
1,073
0
05 Oct 2022
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
1