1712.05877
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
Papers citing
"Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"
50 / 1,298 papers shown
Title
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
Łukasz Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
98
197
0
06 May 2022
Attention-based Knowledge Distillation in Multi-attention Tasks: The Impact of a DCT-driven Loss
Alejandro López-Cifuentes
Marcos Escudero-Viñolo
Jesús Bescós
Juan C. Sanmiguel
61
1
0
04 May 2022
Compact Neural Networks via Stacking Designed Basic Units
Weichao Lan
Y. Cheung
Juyong Jiang
58
0
0
03 May 2022
Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization
Yangcheng Gao
Zhao Zhang
Richang Hong
Haijun Zhang
Jicong Fan
Shuicheng Yan
MQ
60
10
0
30 Apr 2022
A Closer Look at Branch Classifiers of Multi-exit Architectures
Shaohui Lin
Bo Ji
Rongrong Ji
Angela Yao
59
4
0
28 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
116
116
0
25 Apr 2022
A Tale of Two Models: Constructing Evasive Attacks on Edge Models
Wei Hao
Aahil Awatramani
Jia-Bin Hu
Chengzhi Mao
Pin-Chun Chen
Eyal Cidon
Asaf Cidon
Junfeng Yang
AAML
90
4
0
22 Apr 2022
Privacy-preserving Social Distance Monitoring on Microcontrollers with Low-Resolution Infrared Sensors and CNNs
Chen Xie
Francesco Daghero
Yukai Chen
Marco Castellano
Luca Gandolfi
A. Calimera
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
69
4
0
22 Apr 2022
Energy-efficient and Privacy-aware Social Distance Monitoring with Low-resolution Infrared Sensors and Adaptive Inference
Chen Xie
Daniele Jahier Pagliari
A. Calimera
26
2
0
22 Apr 2022
Multiply-and-Fire (MNF): An Event-driven Sparse Neural Network Accelerator
Miao Yu
Tingting Xiang
Venkata Pavan Kumar Miriyala
Trevor E. Carlson
41
1
0
20 Apr 2022
TangoBERT: Reducing Inference Cost by using Cascaded Architecture
Jonathan Mamou
Oren Pereg
Moshe Wasserblat
Roy Schwartz
44
12
0
13 Apr 2022
E^2TAD: An Energy-Efficient Tracking-based Action Detector
Xin Hu
Zhenyu Wu
Haoyuan Miao
Siqi Fan
Taiyu Long
...
Pengcheng Pi
Yi Wu
Zhou Ren
Zhangyang Wang
G. Hua
89
2
0
09 Apr 2022
Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment
Qiang Hu
Yuejun Guo
Maxime Cordy
Xiaofei Xie
Wei Ma
Mike Papadakis
Yves Le Traon
MQ
73
2
0
08 Apr 2022
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
Sharath Girish
Kamal Gupta
Saurabh Singh
Abhinav Shrivastava
98
11
0
06 Apr 2022
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher
Kanghyun Choi
Hye Yoon Lee
Deokki Hong
Joonsang Yu
Noseong Park
Youngsok Kim
Jinho Lee
MQ
104
33
0
31 Mar 2022
L^3U-net: Low-Latency Lightweight U-net Based Image Segmentation Model for Parallel CNN Processors
O. E. Okman
Mehmet Gorkem Ulkar
Gulnur Selda Uyanik
SSeg
39
4
0
30 Mar 2022
4-bit Conformer with Native Quantization Aware Training for Speech Recognition
Shaojin Ding
Phoenix Meadowlark
Yanzhang He
Lukasz Lew
Shivani Agrawal
Oleg Rybakov
MQ
94
36
0
29 Mar 2022
To Fold or Not to Fold: a Necessary and Sufficient Condition on Batch-Normalization Layers Folding
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
66
7
0
28 Mar 2022
REx: Data-Free Residual Quantization Error Expansion
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
103
8
0
28 Mar 2022
MKQ-BERT: Quantized BERT with 4-bits Weights and Activations
Hanlin Tang
Xipeng Zhang
Kai Liu
Jianchen Zhu
Zhanhui Kang
VLM
MQ
51
15
0
25 Mar 2022
Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking Neural Networks
Jan Sommer
M. A. Özkan
Oliver Keszocze
Jürgen Teich
31
18
0
23 Mar 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
63
7
0
22 Mar 2022
Training Quantised Neural Networks with STE Variants: the Additive Noise Annealing Algorithm
Matteo Spallanzani
G. P. Leonardi
Luca Benini
54
3
0
21 Mar 2022
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
Zheng Li
Zijian Wang
Ming Tan
Ramesh Nallapati
Parminder Bhatia
Andrew O. Arnold
Bing Xiang
Dan Roth
MQ
78
44
0
21 Mar 2022
Online Continual Learning for Embedded Devices
Tyler L. Hayes
Christopher Kanan
CLL
96
56
0
21 Mar 2022
Delta Distillation for Efficient Video Processing
A. Habibian
H. Yahia
Davide Abati
E. Gavves
Fatih Porikli
36
12
0
17 Mar 2022
SC2 Benchmark: Supervised Compression for Split Computing
Yoshitomo Matsubara
Ruihan Yang
Marco Levorato
Stephan Mandt
134
20
0
16 Mar 2022
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey
Giorgos Armeniakos
Georgios Zervakis
Dimitrios Soudris
J. Henkel
284
98
0
16 Mar 2022
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models
Eldar Kurtic
Daniel Fernando Campos
Tuan Nguyen
Elias Frantar
Mark Kurtz
Ben Fineran
Michael Goin
Dan Alistarh
VLM
MQ
MedIm
122
127
0
14 Mar 2022
A Mixed Quantization Network for Computationally Efficient Mobile Inverse Tone Mapping
Juan Borrego-Carazo
Mete Ozay
Frederik Laboyrie
Paul Wisbey
MQ
52
0
0
12 Mar 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
102
178
0
11 Mar 2022
CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization
Kilian Pfeiffer
Martin Rapp
R. Khalili
J. Henkel
FedML
118
11
0
10 Mar 2022
Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks
Mingliang Xu
Mingbao Lin
Xunchao Li
Ke Li
Yunhang Shen
Yong Li
Yongjian Wu
Rongrong Ji
MQ
71
27
0
08 Mar 2022
YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
70
13
0
08 Mar 2022
ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization
Menelaos Kanakis
S. Maurer
Matteo Spallanzani
Ajad Chhatkuli
Luc Van Gool
3DPC
86
14
0
07 Mar 2022
Dynamic ConvNets on Tiny Devices via Nested Sparsity
Matteo Grimaldi
Luca Mocerino
A. Cipolletta
A. Calimera
106
6
0
07 Mar 2022
Structured Pruning is All You Need for Pruning CNNs at Initialization
Yaohui Cai
Weizhe Hua
Hongzheng Chen
G. E. Suh
Christopher De Sa
Zhiru Zhang
CVBM
96
15
0
04 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li
Liping Ma
Mengjuan Chen
Junrui Xiao
Qingyi Gu
MQ
ViT
113
46
0
04 Mar 2022
DCT-Former: Efficient Self-Attention with Discrete Cosine Transform
Carmelo Scribano
Giorgia Franchini
M. Prato
Marko Bertogna
66
26
0
02 Mar 2022
SEA: Bridging the Gap Between One- and Two-stage Detector Distillation via SEmantic-aware Alignment
Yixin Chen
Zhuotao Tian
Pengguang Chen
Shu Liu
Jiaya Jia
ObjD
28
1
0
02 Mar 2022
Comprehensive Analysis of the Object Detection Pipeline on UAVs
Leon Amadeus Varga
Sebastian Koch
A. Zell
42
5
0
01 Mar 2022
Multi-task Learning Approach for Modulation and Wireless Signal Classification for 5G and Beyond: Edge Deployment via Model Compression
Anu Jagannath
Jithin Jagannath
62
28
0
26 Feb 2022
Enabling On-Device Smartphone GPU based Training: Lessons Learned
Anish Das
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
3DH
81
11
0
21 Feb 2022
Post-Training Quantization for Cross-Platform Learned Image Compression
Dailan He
Zi Yang
Yuan-Hsin Chen
Qi Zhang
Hongwei Qin
Yan Wang
MQ
79
13
0
15 Feb 2022
A Survey on Model Compression and Acceleration for Pretrained Language Models
Canwen Xu
Julian McAuley
108
61
0
15 Feb 2022
Benchmarking of DL Libraries and Models on Mobile Devices
Qiyang Zhang
Xiang Li
Xiangying Che
Xiao Ma
Ao Zhou
Mengwei Xu
Shangguang Wang
Yudong Han
Xuanzhe Liu
82
49
0
14 Feb 2022
SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
Cong Guo
Yuxian Qiu
Jingwen Leng
Xiaotian Gao
Chen Zhang
Yunxin Liu
Fan Yang
Yuhao Zhu
Minyi Guo
MQ
124
75
0
14 Feb 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kaiyuan Yang
Sergey Tulyakov
MQ
97
50
0
10 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Jemin Lee
Misun Yu
Yongin Kwon
Teaho Kim
MQ
67
17
0
10 Feb 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language
Francesco Fusco
Damian Pascual
Peter W. J. Staar
Diego Antognini
96
30
0
09 Feb 2022