Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
arXiv: 1712.05877
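The paper's core idea is an affine mapping between real values and 8-bit integers, r ≈ S·(q − Z), where S is a floating-point scale and Z an integer zero-point, so that inference can run almost entirely in integer arithmetic. The snippet below is a minimal NumPy sketch of that quantize/dequantize step for illustration only; the function names and the simple per-tensor min/max calibration are assumptions here, not the authors' reference implementation (the paper's scheme ships with TensorFlow Lite).

```python
import numpy as np

def choose_qparams(r_min, r_max, num_bits=8):
    """Pick scale S and zero-point Z so that r ~= S * (q - Z), q in [0, 2^bits - 1].

    The range is widened to include 0 so the zero-point is exactly representable,
    which the paper requires so that zero-padding stays exact after quantization.
    """
    qmin, qmax = 0, (1 << num_bits) - 1
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)
    scale = (r_max - r_min) / (qmax - qmin) or 1e-8  # guard against a degenerate range
    zero_point = int(round(qmin - r_min / scale))
    return scale, int(np.clip(zero_point, qmin, qmax))

def quantize(r, scale, zero_point, num_bits=8):
    # Map real values to uint8 codes: q = round(r / S) + Z, clamped to the valid range.
    q = np.round(r / scale) + zero_point
    return np.clip(q, 0, (1 << num_bits) - 1).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Recover the approximate real values: r ~= S * (q - Z).
    return scale * (q.astype(np.int32) - zero_point)

# Round-trip a random weight tensor to see the quantization error.
w = np.random.randn(64, 64).astype(np.float32)
s, z = choose_qparams(float(w.min()), float(w.max()))
w_q = quantize(w, s, z)
print("max abs round-trip error:", float(np.abs(dequantize(w_q, s, z) - w).max()))
```

At inference time, as described in the paper, matrix multiplications are carried out on the uint8 codes with int32 accumulation, and the only place the real-valued scales enter is a final rescaling, which is itself implemented as a fixed-point multiply so no floating-point arithmetic is needed.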

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

Showing 50 of 1,298 citing papers.
VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization
Cheng Gong
Yao Chen
Ye Lu
Tao Li
Cong Hao
Deming Chen
MQ
51
45
0
18 May 2020
MicroNet for Efficient Language Modeling
Zhongxia Yan
Hanrui Wang
Demi Guo
Song Han
62
8
0
16 May 2020
A flexible, extensible software framework for model compression based on the LC algorithm
Yerlan Idelbayev
Miguel Á. Carreira-Perpiñán
19
9
0
15 May 2020
Bayesian Bits: Unifying Quantization and Pruning
M. V. Baalen
Christos Louizos
Markus Nagel
Rana Ali Amjad
Ying Wang
Tijmen Blankevoort
Max Welling
MQ
95
116
0
14 May 2020
Streaming keyword spotting on mobile devices
Oleg Rybakov
Natasha Kononenko
Niranjan A. Subrahmanya
Mirkó Visontai
Stella Laurenzo
AI4TS
124
112
0
14 May 2020
Data-Free Network Quantization With Adversarial Knowledge Distillation
Yoojin Choi
Jihwan P. Choi
Mostafa El-Khamy
Jungwon Lee
MQ
76
121
0
08 May 2020
Efficient Exact Verification of Binarized Neural Networks
Kai Jia
Martin Rinard
AAML, MQ
48
59
0
07 May 2020
AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance
Young Geun Kim
Carole-Jean Wu
35
3
0
06 May 2020
MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
Yunyang Xiong
Hanxiao Liu
Suyog Gupta
Berkin Akin
Gabriel Bender
Yongzhe Wang
Pieter-Jan Kindermans
Mingxing Tan
Vikas Singh
Bo Chen
ObjD
45
133
0
30 Apr 2020
Do We Need Fully Connected Output Layers in Convolutional Networks?
Zhongchao Qian
Tyler L. Hayes
Kushal Kafle
Christopher Kanan
49
9
0
28 Apr 2020
Compact retail shelf segmentation for mobile deployment
Pratyush Kumar
Muktabh Mayank Srivastava
28
0
0
27 Apr 2020
A scalable and efficient convolutional neural network accelerator using HLS for a System on Chip design
K. Bjerge
J. Schougaard
Daniel Ejnar Larsen
40
1
0
27 Apr 2020
Deploying Image Deblurring across Mobile Devices: A Perspective of Quality and Latency
Cheng-Ming Chiang
Yu-Wen Tseng
Yu-Syuan Xu
Hsien-Kai Kuo
Yi-Min Tsai
...
Chia-Lin Yu
B. Shen
Kloze Kao
Chia-Ming Cheng
Hung-Jen Chen
75
22
0
27 Apr 2020
Q-EEGNet: an Energy-Efficient 8-bit Quantized Parallel EEGNet Implementation for Edge Motor-Imagery Brain--Machine Interfaces
Tibor Schneider
Xiaying Wang
Michael Hersche
Lukas Cavigelli
Luca Benini
57
24
0
24 Apr 2020
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
107
590
0
22 Apr 2020
A Data and Compute Efficient Design for Limited-Resources Deep Learning
Mirgahney Mohamed
Gabriele Cesa
Taco S. Cohen
Max Welling
MedIm
93
18
0
21 Apr 2020
Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation
Hao Wu
Patrick Judd
Xiaojie Zhang
Mikhail Isaev
Paulius Micikevicius
MQ
105
362
0
20 Apr 2020
LSQ+: Improving low-bit quantization through learnable offsets and better initialization
Yash Bhalgat
Jinwon Lee
Markus Nagel
Tijmen Blankevoort
Nojun Kwak
MQ
67
223
0
20 Apr 2020
Training with Quantization Noise for Extreme Model Compression
Angela Fan
Pierre Stock
Benjamin Graham
Edouard Grave
Remi Gribonval
Hervé Jégou
Armand Joulin
MQ
113
246
0
15 Apr 2020
Q-CapsNets: A Specialized Framework for Quantizing Capsule Networks
Alberto Marchisio
Beatrice Bussolino
Alessio Colucci
Maurizio Martina
Guido Masera
Mohamed Bennai
3DPC
47
16
0
15 Apr 2020
DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments
Fan Mo
Ali Shahin Shamsabadi
Kleomenis Katevas
Soteris Demetriou
Ilias Leontiadis
Andrea Cavallaro
Hamed Haddadi
FedML
68
183
0
12 Apr 2020
From Quantized DNNs to Quantizable DNNs
Kunyuan Du
Ya Zhang
Haibing Guan
MQ
56
3
0
11 Apr 2020
Sequence Model Design for Code Completion in the Modern IDE
Gareth Ari Aye
Gail E. Kaiser
60
30
0
10 Apr 2020
FALCON: Honest-Majority Maliciously Secure Framework for Private Deep Learning
Sameer Wagh
Shruti Tople
Fabrice Benhamouda
E. Kushilevitz
Prateek Mittal
T. Rabin
FedML
112
304
0
05 Apr 2020
Binary Neural Networks: A Survey
Haotong Qin
Ruihao Gong
Xianglong Liu
Xiao Bai
Jingkuan Song
N. Sebe
MQ
145
476
0
31 Mar 2020
Organ Segmentation From Full-size CT Images Using Memory-Efficient FCN
Chenglong Wang
M. Oda
K. Mori
MedIm
26
3
0
24 Mar 2020
Efficient Crowd Counting via Structured Knowledge Transfer
Lingbo Liu
Jiaqi Chen
Hefeng Wu
Tianshui Chen
Guanbin Li
Liang Lin
98
65
0
23 Mar 2020
LANCE: Efficient Low-Precision Quantized Winograd Convolution for Neural Networks Based on Graphics Processing Units
Guangli Li
Lei Liu
Xueying Wang
Xiu Ma
Xiaobing Feng
MQ
50
18
0
19 Mar 2020
Efficient Bitwidth Search for Practical Mixed Precision Neural Network
Yuhang Li
Wei Wang
Haoli Bai
Ruihao Gong
Xin Dong
F. Yu
MQ
54
21
0
17 Mar 2020
Resolution Adaptive Networks for Efficient Inference
Le Yang
Yizeng Han
Xi Chen
Shiji Song
Jifeng Dai
Gao Huang
119
219
0
16 Mar 2020
BiDet: An Efficient Binarized Object Detector
Ziwei Wang
Ziyi Wu
Jiwen Lu
Jie Zhou
MQ
123
66
0
09 Mar 2020
Generative Low-bitwidth Data Free Quantization
Shoukai Xu
Haokun Li
Bohan Zhuang
Jing Liu
Jingyun Liang
Chuangrun Liang
Mingkui Tan
MQ
79
127
0
07 Mar 2020
Learned Threshold Pruning
K. Azarian
Yash Bhalgat
Jinwon Lee
Tijmen Blankevoort
MQ
86
38
0
28 Feb 2020
DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding
Yuyu Zhang
Ping Nie
Xiubo Geng
Arun Ramamurthy
Le Song
Daxin Jiang
81
61
0
28 Feb 2020
Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks
Théodore Bluche
Maël Primet
Thibault Gisselbrecht
ObjD, MQ
67
24
0
25 Feb 2020
Searching for Winograd-aware Quantized Networks
Javier Fernandez-Marques
P. Whatmough
Andrew Mundy
Matthew Mattina
MQ
76
40
0
25 Feb 2020
TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU
Filip Vaverka
Vojtěch Mrázek
Z. Vašíček
Lukás Sekanina
58
34
0
21 Feb 2020
Robust Quantization: One Model to Rule Them All
Moran Shkolnik
Brian Chmiel
Ron Banner
Gil Shomron
Yury Nahshan
A. Bronstein
U. Weiser
OOD, MQ
104
76
0
18 Feb 2020
Gradient $\ell_1$ Regularization for Quantization Robustness
Milad Alizadeh
Arash Behboodi
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
Max Welling
MQ
58
8
0
18 Feb 2020
Taurus: A Data Plane Architecture for Per-Packet ML
Tushar Swamy
Alexander Rucker
M. Shahbaz
Ishan Gaur
K. Olukotun
56
87
0
12 Feb 2020
Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence
S. Raschka
Joshua Patterson
Corey J. Nolet
AI4CE
113
505
0
12 Feb 2020
A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions
Aaron R. Voelker
Daniel Rasmussen
C. Eliasmith
80
17
0
10 Feb 2020
Understanding and Improving Knowledge Distillation
Jiaxi Tang
Rakesh Shivanna
Zhe Zhao
Dong Lin
Anima Singh
Ed H. Chi
Sagar Jain
103
134
0
10 Feb 2020
Post-Training Piecewise Linear Quantization for Deep Neural Networks
Jun Fang
Ali Shafiee
Hamzah Abdel-Aziz
D. Thorsley
Georgios Georgiadis
Joseph Hassoun
MQ
108
148
0
31 Jan 2020
Boosted and Differentially Private Ensembles of Decision Trees
Richard Nock
Wilko Henecka
54
2
0
26 Jan 2020
The Two-Pass Softmax Algorithm
Marat Dukhan
Artsiom Ablavatski
TPM
31
8
0
13 Jan 2020
Least squares binary quantization of neural networks
Hadi Pouransari
Zhucheng Tu
Oncel Tuzel
MQ
82
32
0
09 Jan 2020
Resource-Efficient Neural Networks for Embedded Systems
Wolfgang Roth
Günther Schindler
Lukas Pfeifenberger
Robert Peharz
Sebastian Tschiatschek
Holger Fröning
Franz Pernkopf
Zoubin Ghahramani
88
51
0
07 Jan 2020
Sparse Weight Activation Training
Md Aamir Raihan
Tor M. Aamodt
155
73
0
07 Jan 2020
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference
Jianghao Shen
Y. Fu
Yue Wang
Pengfei Xu
Zhangyang Wang
Yingyan Lin
MQ
60
45
0
03 Jan 2020