2409.17066
Cited By
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
25 September 2024
Yifei Liu
Jicheng Wen
Yang Wang
Shengyu Ye
Li Lyna Zhang
Ting Cao
Cheng Li
Mao Yang
MQ
Papers citing
"VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models"
9 / 9 papers shown
Title
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization
Sifeng Shang
Jiayi Zhou
Chenyu Lin
Minxian Li
Kaiyang Zhou
MQ
7
0
0
19 May 2025
RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization
Chen Xu
Yuxuan Yue
Zukang Xu
Xing Hu
Jiangyong Yu
Zhixuan Chen
Sifan Zhou
Zhihang Yuan
Dawei Yang
MQ
32
0
0
02 May 2025
NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models
Lawrence Liu
Inesh Chakrabarti
Yixiao Li
Mengdi Wang
Tuo Zhao
Lin F. Yang
MQ
37
0
0
20 Apr 2025
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai
Yuma Ichikawa
MQ
34
0
0
13 Apr 2025
QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
Yuxuan Hu
Xiaodong Chen
C. Li
Hongyu Chen
J. Zhang
MQ
60
0
0
25 Mar 2025
PIPO: Pipelined Offloading for Efficient Inference on Consumer Devices
Yangyijian Liu
Jun Yu Li
Wu-Jun Li
36
0
0
15 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
60
0
0
12 Mar 2025
Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models
Xiao Liu
Lijun Zhang
Deepak Ganesan
Hui Guan
VLM
33
0
0
08 Nov 2024
Pyramid Vector Quantization for LLMs
Tycho F. A. van der Ouderaa
Maximilian L. Croci
Agrin Hilmkil
James Hensman
MQ
34
1
0
22 Oct 2024