1809.04191
Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference
11 September 2018
J. McKinstry
S. K. Esser
R. Appuswamy
Deepika Bablani
John V. Arthur
Izzet B. Yildiz
D. Modha
MQ
Papers citing
"Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference"
19 / 19 papers shown
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein
Ella Fuchs
Idan Tal
Mark Grobman
Niv Vosco
Eldad Meller
MQ
29
6
0
05 Dec 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
36
145
0
27 Sep 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
25
11
0
11 Aug 2022
Elastic-Link for Binarized Neural Network
Jie Hu
Ziheng Wu
Vince Tan
Zhilin Lu
Mengze Zeng
Enhua Wu
MQ
30
6
0
19 Dec 2021
Quantization-Guided Training for Compact TinyML Models
Sedigh Ghamari
Koray Ozcan
Thu Dinh
A. Melnikov
Juan Carvajal
Jan Ernst
S. Chai
MQ
21
16
0
10 Mar 2021
Dynamic Precision Analog Computing for Neural Networks
Sahaj Garg
Joe Lou
Anirudh Jain
Mitchell Nahmias
45
33
0
12 Feb 2021
VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Steve Dai
Rangharajan Venkatesan
Haoxing Ren
B. Zimmer
W. Dally
Brucek Khailany
MQ
27
67
0
08 Feb 2021
Sparse Weight Activation Training
Md Aamir Raihan
Tor M. Aamodt
34
73
0
07 Jan 2020
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu
Ruihao Gong
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
35
150
0
29 Dec 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars
F. García-Redondo
Shidhartha Das
G. Rosendale
19
5
0
30 Oct 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
Ruihao Gong
Xianglong Liu
Shenghu Jiang
Tian-Hao Li
Peng Hu
Jiazhen Lin
F. Yu
Junjie Yan
MQ
32
446
0
14 Aug 2019
3D-aCortex: An Ultra-Compact Energy-Efficient Neurocomputing Platform Based on Commercial 3D-NAND Flash Memories
Mohammad Bavandpour
Shubham Sahay
M. Mahmoodi
D. Strukov
16
29
0
07 Aug 2019
AutoQ: Automated Kernel-Wise Neural Network Quantization
Qian Lou
Feng Guo
Lantao Liu
Minje Kim
Lei Jiang
MQ
21
97
0
15 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Eldad Meller
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
21
85
0
05 Feb 2019
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Ritchie Zhao
Yuwei Hu
Jordan Dotzel
Christopher De Sa
Zhiru Zhang
OODD
MQ
38
305
0
28 Jan 2019
Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks
Charbel Sakr
Naigang Wang
Chia-Yu Chen
Jungwook Choi
A. Agrawal
Naresh R Shanbhag
K. Gopalakrishnan
MQ
22
34
0
19 Jan 2019
Post-training 4-bit quantization of convolution networks for rapid-deployment
Ron Banner
Yury Nahshan
Elad Hoffer
Daniel Soudry
MQ
19
93
0
02 Oct 2018
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights
Aojun Zhou
Anbang Yao
Yiwen Guo
Lin Xu
Yurong Chen
MQ
337
1,049
0
10 Feb 2017