Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1902.08153
Cited By
Learned Step Size Quantization
21 February 2019
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learned Step Size Quantization"
50 / 181 papers shown
Title
Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
Lianbo Ma
Jianlun Ma
Yuee Zhou
Guoyang Xie
Qiang He
Zhichao Lu
MQ
48
0
0
08 May 2025
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Mingli Song
Jie Song
MQ
53
0
0
08 May 2025
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
Lukas Meiner
Jens Mehnert
Alexandru Paul Condurache
MQ
44
0
0
06 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
36
0
0
05 May 2025
Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction
Changjun Li
Runqing Jiang
Zhuo Song
Pengpeng Yu
Ye Zhang
Yulan Guo
MQ
56
0
0
01 May 2025
Gradual Binary Search and Dimension Expansion : A general method for activation quantization in LLMs
Lucas Maisonnave
Cyril Moineau
Olivier Bichler
Fabrice Rastello
MQ
42
0
0
18 Apr 2025
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
Jun Zhao
Miao Zhang
Ming Wang
Yuzhang Shang
Kaihao Zhang
Weili Guan
Yaowei Wang
Min Zhang
MQ
49
0
0
18 Feb 2025
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou
Shuo Wang
Zhihang Yuan
Mingjia Shi
Yuzhang Shang
Dawei Yang
ALM
MQ
90
0
0
18 Feb 2025
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou
Qizheng Zhang
Hermann Kumbong
Kunle Olukotun
MQ
332
0
0
12 Feb 2025
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
Zechun Liu
Changsheng Zhao
Hanxian Huang
Sijia Chen
Jing Zhang
...
Yuandong Tian
Bilge Soran
Raghuraman Krishnamoorthi
Tijmen Blankevoort
Vikas Chandra
MQ
81
5
0
04 Feb 2025
Nearly Lossless Adaptive Bit Switching
Haiduo Huang
Zhenhua Liu
Tian Xia
Wenzhe zhao
Pengju Ren
MQ
68
0
0
03 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
82
2
0
24 Jan 2025
MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights
Van Thien Nguyen
William Guicquero
Gilles Sicard
MQ
75
1
0
17 Jan 2025
Histogram-Equalized Quantization for logic-gated Residual Neural Networks
Van Thien Nguyen
William Guicquero
Gilles Sicard
MQ
46
2
0
10 Jan 2025
Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation
Tse-Wei Chen
Wei Tao
Dongyue Zhao
Kazuhiro Mima
Tadayuki Ito
Kinya Osa
Masami Kato
MQ
38
0
0
03 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Mingliang Xu
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Rongrong Ji
Zhanpeng Zeng
Rongrong Ji
MQ
102
0
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
53
2
0
29 Dec 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
45
0
0
01 Nov 2024
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models
Hang Guo
Yawei Li
Tao Dai
Shu-Tao Xia
Luca Benini
MQ
39
1
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
244
0
0
29 Oct 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
32
2
0
10 Oct 2024
JPEG Inspired Deep Learning
Ahmed H. Salamah
Kaixiang Zheng
Yiwen Liu
En-Hui Yang
37
0
0
09 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
40
1
0
08 Oct 2024
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
Sean I. Young
MQ
55
1
0
03 Sep 2024
Realizing Unaligned Block-wise Pruning for DNN Acceleration on Mobile Devices
Hayun Lee
Dongkun Shin
MQ
28
0
0
29 Jul 2024
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
Kanghyun Choi
Hyeyoon Lee
Dain Kwon
Sunjong Park
Kyuyeun Kim
Noseong Park
Jinho Lee
Jinho Lee
MQ
55
1
0
29 Jul 2024
Temporal Feature Matters: A Framework for Diffusion Model Quantization
Yushi Huang
Ruihao Gong
Xianglong Liu
Jing Liu
Yuhang Li
Jiwen Lu
Dacheng Tao
DiffM
MQ
49
0
0
28 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
68
0
0
22 Jul 2024
MetaAug: Meta-Data Augmentation for Post-Training Quantization
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
Dinh Q. Phung
Gustavo Carneiro
Thanh-Toan Do
MQ
46
0
0
20 Jul 2024
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
Jung Hyun Lee
Jeonghoon Kim
J. Yang
S. Kwon
Eunho Yang
Kang Min Yoo
Dongsoo Lee
MQ
36
2
0
16 Jul 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Yonghong Tian
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Ping Luo
MQ
58
26
0
10 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
Li Du
Guoqi Li
Jiajun Zhang
63
5
0
05 Jul 2024
Timestep-Aware Correction for Quantized Diffusion Models
Yuzhe Yao
Feng Tian
Jun Chen
Haonan Lin
Guang Dai
Yong Liu
Jingdong Wang
DiffM
MQ
46
5
0
04 Jul 2024
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Wenshuo Li
Xinghao Chen
Han Shu
Yehui Tang
Yunhe Wang
MQ
47
2
0
17 Jun 2024
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
71
1
0
12 Jun 2024
xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems
Georg Rutishauser
Joan Mihali
Moritz Scherer
Luca Benini
34
1
0
29 May 2024
Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection
Yunqian Fan
Xiuying Wei
Ruihao Gong
Yuqing Ma
Xiangguo Zhang
Qi Zhang
Xianglong Liu
MQ
40
2
0
10 May 2024
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
Cédric Gernigon
Silviu-Ioan Filip
Olivier Sentieys
Clément Coggiola
Mickael Bruno
28
2
0
22 Apr 2024
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon
Dohyung Kim
Junyong Cheon
Bumsub Ham
MQ
ViT
34
7
0
01 Apr 2024
Better Schedules for Low Precision Training of Deep Neural Networks
Cameron R. Wolfe
Anastasios Kyrillidis
47
1
0
04 Mar 2024
Boosting Neural Representations for Videos with a Conditional Decoder
Xinjie Zhang
Ren Yang
Dailan He
Xingtong Ge
Tongda Xu
Yan Wang
Hongwei Qin
Jun Zhang
43
15
0
28 Feb 2024
Effective Gradient Sample Size via Variation Estimation for Accelerating Sharpness aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
Tian Wang
48
1
0
24 Feb 2024
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
Junhan Kim
Kyungphil Park
Chungman Lee
Ho-Young Kim
Joonyoung Kim
Yongkweon Jeon
MQ
28
2
0
14 Feb 2024
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
Arnav Chavan
Raghav Magazine
Shubham Kushwaha
M. Debbah
Deepak Gupta
23
18
0
02 Feb 2024
A Novel Paradigm for Neural Computation: X-Net with Learnable Neurons and Adaptable Structure
Yanjie Li
Weijun Li
Lina Yu
Min Wu
Jinyi Liu
...
Xin Ning
Yugui Zhang
Baoli Lu
Jian Xu
Shuang Li
28
0
0
03 Jan 2024
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Han Shu
Wenshuo Li
Yehui Tang
Yiman Zhang
Yihao Chen
Houqiang Li
Yunhe Wang
Xinghao Chen
VLM
44
19
0
21 Dec 2023
ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks
Peng Zhao
Jiehua Zhang
Bowen Peng
Longguang Wang
Yingmei Wei
Yu Liu
Li Liu
AAML
37
0
0
21 Dec 2023
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Shaojin Ding
David Qiu
David Rim
Yanzhang He
Oleg Rybakov
...
Tara N. Sainath
Zhonglin Han
Jian Li
Amir Yazdanbakhsh
Shivani Agrawal
MQ
34
9
0
13 Dec 2023
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
38
13
0
13 Dec 2023
SmoothQuant+: Accurate and Efficient 4-bit Post-Training WeightQuantization for LLM
Jiayi Pan
Chengcan Wang
Kaifu Zheng
Yangguang Li
Zhenyu Wang
Bin Feng
MQ
43
7
0
06 Dec 2023
1
2
3
4
Next