ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
12 March 2025
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
Papers citing "ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba"
31 / 31 papers shown
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
306
706
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Jun-gyu Jin
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
115
2
0
29 Dec 2024
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
Yifei Liu
Jicheng Wen
Yang Wang
Shengyu Ye
Li Lyna Zhang
Ting Cao
Cheng Li
Mao Yang
MQ
218
16
0
25 Sep 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
96
98
0
26 Mar 2024
LocalMamba: Visual State Space Model with Windowed Selective Scan
Tao Huang
Xiaohuan Pei
Shan You
Fei Wang
Chao Qian
Chang Xu
Mamba
90
154
0
14 Mar 2024
GPTVQ: The Blessing of Dimensionality for LLM Quantization
M. V. Baalen
Andrey Kuzmin
Ivan Koryakovskiy
Markus Nagel
Peter Couperus
Cédric Bastoul
E. Mahurin
Tijmen Blankevoort
Paul N. Whatmough
MQ
94
35
0
23 Feb 2024
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
Wenqi Shao
Mengzhao Chen
Zhaoyang Zhang
Peng Xu
Lirui Zhao
Zhiqiang Li
Kaipeng Zhang
Peng Gao
Yu Qiao
Ping Luo
MQ
94
204
0
25 Aug 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
104
578
0
01 Jun 2023
Post-training Quantization on Diffusion Models
Yuzhang Shang
Zhihang Yuan
Bin Xie
Bingzhe Wu
Yan Yan
DiffM
MQ
132
182
0
28 Nov 2022
Learning Low-Rank Representations for Model Compression
Zezhou Zhu
Yucong Zhou
Zhaobai Zhong
SSL
MQ
54
3
0
21 Nov 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
152
1,008
0
31 Oct 2022
On the Parameterization and Initialization of Diagonal State Space Models
Albert Gu
Ankit Gupta
Karan Goel
Christopher Ré
91
324
0
23 Jun 2022
Efficiently Modeling Long Sequences with Structured State Spaces
Albert Gu
Karan Goel
Christopher Ré
217
1,829
0
31 Oct 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
108
343
0
27 Jun 2021
Network Quantization with Element-wise Gradient Scaling
Junghyup Lee
Dohyung Kim
Bumsub Ham
MQ
76
120
0
02 Apr 2021
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
Yuhang Li
Ruihao Gong
Xu Tan
Yang Yang
Peng Hu
Qi Zhang
F. Yu
Wei Wang
Shi Gu
MQ
149
444
0
10 Feb 2021
Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks
Julieta Martinez
Jashan Shewakramani
Ting Liu
Ioan Andrei Bârsan
Wenyuan Zeng
R. Urtasun
MQ
68
31
0
29 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
682
41,483
0
22 Oct 2020
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
95
588
0
22 Apr 2020
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
95
282
0
10 Nov 2019
And the Bit Goes Down: Revisiting the Quantization of Neural Networks
Pierre Stock
Armand Joulin
Rémi Gribonval
Benjamin Graham
Hervé Jégou
MQ
99
149
0
12 Jul 2019
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
75
810
0
21 Feb 2019
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Kuan-Chieh Wang
Zhijian Liu
Chengyue Wu
Ji Lin
Song Han
MQ
129
884
0
21 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,229
0
11 Oct 2018
Deep k-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions
Junru Wu
Yue Wang
Zhenyu Wu
Zhangyang Wang
Ashok Veeraraghavan
Yingyan Lin
59
115
0
24 Jun 2018
Quantizing deep convolutional networks for efficient inference: A whitepaper
Raghuraman Krishnamoorthi
MQ
141
1,021
0
21 Jun 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
803
132,454
0
12 Jun 2017
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights
Aojun Zhou
Anbang Yao
Yiwen Guo
Lin Xu
Yurong Chen
MQ
399
1,055
0
10 Feb 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,510
0
10 Dec 2015
An Introduction to Convolutional Neural Networks
K. O’Shea
Ryan Nash
FaML
HAI
85
3,159
0
26 Nov 2015
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
263
8,862
0
01 Oct 2015