Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1902.08153
Cited By
Learned Step Size Quantization
21 February 2019
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learned Step Size Quantization"
50 / 181 papers shown
Title
QuantNAS for super resolution: searching for efficient quantization-friendly architectures against quantization noise
Egor Shvetsov
Dmitry Osin
Alexey Zaytsev
Ivan Koryakovskiy
Valentin Buchnev
I. Trofimov
Evgeny Burnaev
MQ
33
2
0
31 Aug 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li
Cong Guo
Zhanda Zhu
Yangjie Zhou
Yuxian Qiu
Xiaotian Gao
Jingwen Leng
Minyi Guo
MQ
34
4
0
25 Aug 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin
M. V. Baalen
Yuwei Ren
Markus Nagel
Jorn W. T. Peters
Tijmen Blankevoort
MQ
25
81
0
19 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
30
11
0
11 Aug 2022
Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA
Cecilia Latotzke
Tim Ciesielski
T. Gemmeke
MQ
13
8
0
09 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park
Yeongsang Jang
Eunhyeok Park
MQ
26
2
0
31 Jul 2022
CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
Chee Hong
Sungyong Baik
Heewon Kim
Seungjun Nah
Kyoung Mu Lee
SupR
MQ
31
32
0
21 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
57
96
0
04 Jul 2022
QReg: On Regularization Effects of Quantization
Mohammadhossein Askarihemmat
Reyhane Askari Hemmat
Alexander Hoffman
Ivan Lazarevich
Ehsan Saboori
Olivier Mastropietro
Sudhakar Sah
Yvon Savaria
J. David
MQ
41
5
0
24 Jun 2022
Fast Lossless Neural Compression with Integer-Only Discrete Flows
Siyu Wang
Jianfei Chen
Chongxuan Li
Jun Zhu
Bo Zhang
MQ
26
7
0
17 Jun 2022
Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training
Charbel Sakr
Steve Dai
Rangharajan Venkatesan
B. Zimmer
W. Dally
Brucek Khailany
MQ
27
41
0
13 Jun 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM
MQ
73
448
0
04 Jun 2022
Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures
Yongji Wu
Matthew Lentz
Danyang Zhuo
Yao Lu
34
22
0
10 May 2022
Compact Model Training by Low-Rank Projection with Energy Transfer
K. Guo
Zhenquan Lin
Xiaofen Xing
Fang Liu
Xiangmin Xu
40
2
0
12 Apr 2022
SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
Xin Dong
B. D. Salvo
Meng Li
Chiao Liu
Zhongnan Qu
H. T. Kung
Ziyun Li
3DGS
31
20
0
10 Apr 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
20
7
0
22 Mar 2022
Minimum Variance Unbiased N:M Sparsity for the Neural Gradients
Brian Chmiel
Itay Hubara
Ron Banner
Daniel Soudry
23
10
0
21 Mar 2022
Compression of Generative Pre-trained Language Models via Quantization
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
38
103
0
21 Mar 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
19
168
0
11 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li
Liping Ma
Mengjuan Chen
Junrui Xiao
Qingyi Gu
MQ
ViT
27
44
0
04 Mar 2022
Standard Deviation-Based Quantization for Deep Neural Networks
Amir Ardakani
A. Ardakani
B. Meyer
J. Clark
W. Gross
MQ
55
1
0
24 Feb 2022
Highly-Efficient Binary Neural Networks for Visual Place Recognition
Bruno Ferrarini
Michael Milford
Klaus D. McDonald-Maier
Shoaib Ehsan
21
7
0
24 Feb 2022
Post-Training Quantization for Cross-Platform Learned Image Compression
Dailan He
Zi Yang
Yuan-Hsin Chen
Qi Zhang
Hongwei Qin
Yan Wang
MQ
45
13
0
15 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Jemin Lee
Misun Yu
Yongin Kwon
Teaho Kim
MQ
30
17
0
10 Feb 2022
Energy awareness in low precision neural networks
Nurit Spingarn-Eliezer
Ron Banner
Elad Hoffer
Hilla Ben-Yaacov
T. Michaeli
41
0
0
06 Feb 2022
COIN++: Neural Compression Across Modalities
Emilien Dupont
H. Loya
Milad Alizadeh
Adam Goliñski
Yee Whye Teh
Arnaud Doucet
63
83
0
30 Jan 2022
Resource-efficient Deep Neural Networks for Automotive Radar Interference Mitigation
J. Rock
Wolfgang Roth
Máté Tóth
Paul Meissner
Franz Pernkopf
30
43
0
25 Jan 2022
Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)
S. Siddegowda
Marios Fournarakis
Markus Nagel
Tijmen Blankevoort
Chirag I. Patel
Abhijit Khobare
MQ
14
32
0
20 Jan 2022
Implicit Neural Video Compression
Yunfan Zhang
T. V. Rozendaal
Johann Brehmer
Markus Nagel
Taco S. Cohen
49
57
0
21 Dec 2021
N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based Heterogeneous Computing Cores
Yu Gong
Zhihang Xu
Zhezhi He
Weifeng Zhang
Xiaobing Tu
Xiaoyao Liang
Li Jiang
33
13
0
15 Dec 2021
AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks
Huu Le
R. Høier
Che-Tsung Lin
Christopher Zach
55
17
0
06 Dec 2021
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation
Zechun Liu
Kwang-Ting Cheng
Dong Huang
Eric P. Xing
Zhiqiang Shen
MQ
25
103
0
29 Nov 2021
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu
Jianfei Cai
Bohan Zhuang
MQ
32
24
0
24 Nov 2021
Mesa: A Memory-saving Training Framework for Transformers
Zizheng Pan
Peng Chen
Haoyu He
Jing Liu
Jianfei Cai
Bohan Zhuang
31
20
0
22 Nov 2021
IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization
Mingliang Xu
Mingbao Lin
Gongrui Nan
Jianzhuang Liu
Baochang Zhang
Yonghong Tian
Rongrong Ji
MQ
51
71
0
17 Nov 2021
Arch-Net: Model Distillation for Architecture Agnostic Model Deployment
Weixin Xu
Zipeng Feng
Shuangkang Fang
Song Yuan
Yi Yang
Shuchang Zhou
MQ
30
1
0
01 Nov 2021
CHIP: CHannel Independence-based Pruning for Compact Neural Networks
Yang Sui
Miao Yin
Yi Xie
Huy Phan
S. Zonouz
Bo Yuan
VLM
37
129
0
26 Oct 2021
Haar Wavelet Feature Compression for Quantized Graph Convolutional Networks
Moshe Eliasof
Ben Bodner
Eran Treister
GNN
35
7
0
10 Oct 2021
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
82
47
0
30 Sep 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
25
133
0
27 Sep 2021
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
56
4
0
20 Sep 2021
2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency
Yonggan Fu
Yang Katie Zhao
Qixuan Yu
Chaojian Li
Yingyan Lin
AAML
52
12
0
11 Sep 2021
Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations
Ido Ben-Yair
Gil Ben Shalom
Moshe Eliasof
Eran Treister
MQ
36
5
0
31 Aug 2021
Quantization of Generative Adversarial Networks for Efficient Inference: a Methodological Study
Pavel Andreev
Alexander Fritzler
Dmitry Vetrov
MQ
19
10
0
31 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
22
89
0
30 Aug 2021
A High-Performance Adaptive Quantization Approach for Edge CNN Applications
Hsu-Hsun Chin
R. Tsay
Hsin-I Wu
MQ
24
5
0
18 Jul 2021
Learned Token Pruning for Transformers
Sehoon Kim
Sheng Shen
D. Thorsley
A. Gholami
Woosuk Kwon
Joseph Hassoun
Kurt Keutzer
17
146
0
02 Jul 2021
A White Paper on Neural Network Quantization
Markus Nagel
Marios Fournarakis
Rana Ali Amjad
Yelysei Bondarenko
M. V. Baalen
Tijmen Blankevoort
MQ
36
510
0
15 Jun 2021
Quantization and Deployment of Deep Neural Networks on Microcontrollers
Pierre-Emmanuel Novac
G. B. Hacene
Alain Pegatoquet
Benoit Miramond
Vincent Gripon
MQ
25
116
0
27 May 2021
BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer
Haoping Bai
Mengsi Cao
Ping Huang
Jiulong Shan
MQ
25
34
0
19 May 2021
Previous
1
2
3
4
Next