
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ
arXiv: 1712.05877

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown
Title
Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference
Gvsl Tej Pratap
R. Kumar
MQ
59
1
0
26 Dec 2020
Low-latency Perception in Off-Road Dynamical Low Visibility Environments
Nelson Alves Ferreira Neto
Marco Ruiz
M. Reis
Tiago Cajahyba
David F. N. Oliveira
Ana Barreto
Eduardo F. Simas Filho
Wagner Luiz Alves de Oliveira
L. Schnitman
Roberto L. S. Monteiro
29
10
0
23 Dec 2020
Adaptive Precision Training for Resource Constrained Devices
Tian Huang
Yaoyu Zhang
Qiufeng Wang
65
5
0
23 Dec 2020
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Maurizio Capra
Beatrice Bussolino
Alberto Marchisio
Guido Masera
Maurizio Martina
Mohamed Bennai
BDL
138
147
0
21 Dec 2020
Efficient CNN-LSTM based Image Captioning using Neural Network Compression
Harshit Rampal
Aman Mohanty
VLM
46
4
0
17 Dec 2020
Revisiting Linformer with a modified self-attention with linear complexity
Madhusudan Verma
51
8
0
16 Dec 2020
Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis
Shachar Gluska
Mark Grobman
MQ
54
5
0
15 Dec 2020
Scalable Verification of Quantized Neural Networks (Technical Report)
T. Henzinger
Mathias Lechner
Dorde Zikelic
MQ
64
34
0
15 Dec 2020
Demystifying Deep Neural Networks Through Interpretation: A Survey
Giang Dao
Minwoo Lee
FaML, FAtt
66
1
0
13 Dec 2020
Privacy-Preserving Spam Filtering using Functional Encryption
Sicong Wang
Naveen Karunanayake
Tham Nguyen
Suranga Seneviratne
31
2
0
08 Dec 2020
An Once-for-All Budgeted Pruning Framework for ConvNets Considering Input Resolution
Wenyu Sun
Jian Cao
Pengtao Xu
Xiangcheng Liu
Pu Li
36
0
0
02 Dec 2020
Solvable Model for Inheriting the Regularization through Knowledge Distillation
Luca Saglietti
Lenka Zdeborová
53
20
0
01 Dec 2020
A Tiny CNN Architecture for Medical Face Mask Detection for Resource-Constrained Endpoints
P. Mohan
A. Paul
Abhay Chirania
CVBM
64
49
0
30 Nov 2020
KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization
Het Shah
Avishree Khare
Neelay Shah
Khizir Siddiqui
MQ
45
6
0
30 Nov 2020
Robust Ultra-wideband Range Error Mitigation with Deep Learning at the Edge
Simone Angarano
Vittorio Mazzia
Francesco Salvetti
Giovanni Fantin
Marcello Chiaberge
139
44
0
30 Nov 2020
FactorizeNet: Progressive Depth Factorization for Efficient Network Architecture Exploration Under Quantization Constraints
S. Yun
A. Wong
MQ
29
2
0
30 Nov 2020
Where Should We Begin? A Low-Level Exploration of Weight Initialization Impact on Quantized Behaviour of Deep Neural Networks
S. Yun
A. Wong
MQ
46
4
0
30 Nov 2020
Bringing AI To Edge: From Deep Learning's Perspective
Di Liu
Hao Kong
Xiangzhong Luo
Weichen Liu
Ravi Subramaniam
116
124
0
25 Nov 2020
Auto Graph Encoder-Decoder for Neural Network Pruning
Sixing Yu
Arya Mazaheri
Ali Jannesari
GNN
81
40
0
25 Nov 2020
HAWQV3: Dyadic Neural Network Quantization
Z. Yao
Zhen Dong
Zhangcheng Zheng
A. Gholami
Jiali Yu
...
Leyuan Wang
Qijing Huang
Yida Wang
Michael W. Mahoney
Kurt Keutzer
MQ
128
87
0
20 Nov 2020
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder
Sam Davis
Giuseppe Coccia
Sam Gooch
Julian Mack
38
0
0
20 Nov 2020
Layer-Wise Data-Free CNN Compression
Maxwell Horton
Yanzi Jin
Ali Farhadi
Mohammad Rastegari
MQ
67
17
0
18 Nov 2020
Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks
Jun Nishikawa
Ryoji Ikegaya
MQ
34
1
0
13 Nov 2020
ATCN: Resource-Efficient Processing of Time Series on Edge
Mohammadreza Baharani
Hamed Tabkhi
AI4TS
81
1
0
10 Nov 2020
Neural Network Compression Via Sparse Optimization
Tianyi Chen
Bo Ji
Yixin Shi
Tianyu Ding
Biyi Fang
Sheng Yi
Xiao Tu
84
16
0
10 Nov 2020
FRILL: A Non-Semantic Speech Embedding for Mobile Devices
J. Peplinski
Joel Shor
Sachin P. Joglekar
Jake Garrison
Shwetak N. Patel
68
24
0
09 Nov 2020
PAMS: Quantized Super-Resolution via Parameterized Max Scale
Huixia Li
Chenqian Yan
Shaohui Lin
Xiawu Zheng
Yuchao Li
Baochang Zhang
Fan Yang
Rongrong Ji
MQ
76
86
0
09 Nov 2020
ReFloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers
Linghao Song
Fan Chen
Xuehai Qian
Hai Li
Yiran Chen
62
6
0
06 Nov 2020
Paralinguistic Privacy Protection at the Edge
Ranya Aloufi
Hamed Haddadi
David E. Boyle
66
14
0
04 Nov 2020
Methods for Pruning Deep Neural Networks
S. Vadera
Salem Ameen
3DPC
76
131
0
31 Oct 2020
Visually Guided Balloon Popping with an Autonomous MAV at MBZIRC 2020
Marius Beul
S. Bultmann
Andre Rochow
R. Rosu
Daniel Schleich
Malte Splietker
Sven Behnke
50
8
0
28 Oct 2020
Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets
Kai Han
Yunhe Wang
Qiulin Zhang
Wei Zhang
Chunjing Xu
Tong Zhang
74
89
0
28 Oct 2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
Jianfei Chen
Yujie Gai
Z. Yao
Michael W. Mahoney
Joseph E. Gonzalez
MQ
73
59
0
27 Oct 2020
μNAS: Constrained Neural Architecture Search for Microcontrollers
Edgar Liberis
Łukasz Dudziak
Nicholas D. Lane
BDL
63
106
0
27 Oct 2020
Pre-trained Summarization Distillation
Sam Shleifer
Alexander M. Rush
69
103
0
24 Oct 2020
MARS: Multi-macro Architecture SRAM CIM-Based Accelerator with Co-designed Compressed Neural Networks
Syuan-Hao Sie
Jye-Luen Lee
Yi-Ren Chen
Chih-Cheng Lu
C. Hsieh
Meng-Fan Chang
K. Tang
36
14
0
24 Oct 2020
Adaptive Pixel-wise Structured Sparse Network for Efficient CNNs
Chen Tang
Wenyu Sun
Zhuqing Yuan
Yongpan Liu
30
0
0
21 Oct 2020
Characterizing and Taming Model Instability Across Edge Devices
Eyal Cidon
Evgenya Pergament
Zain Asgar
Asaf Cidon
Sachin Katti
63
7
0
18 Oct 2020
CrypTFlow2: Practical 2-Party Secure Inference
Deevashwer Rathee
Mayank Rathee
Nishant Kumar
Nishanth Chandran
Divya Gupta
Aseem Rastogi
Rahul Sharma
139
319
0
13 Oct 2020
S3ML: A Secure Serving System for Machine Learning Inference
Junming Ma
Chaofan Yu
Aihui Zhou
Bingzhe Wu
Xibin Wu
Xingyu Chen
Xiangqun Chen
Lei Wang
Donggang Cao
43
3
0
13 Oct 2020
Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification
Yulin Wang
Kangchen Lv
Rui Huang
Shiji Song
Le Yang
Gao Huang
3DH
57
151
0
11 Oct 2020
Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer
Mahdi Ghorbani
Fahimeh Fooladgar
S. Kasaei
AAML
53
0
0
09 Oct 2020
Real-time Mask Detection on Google Edge TPU
Keondo Park
Won Jang
Woochul Lee
K. Nam
Kihong Seong
Kyuwook Chai
Wen-Syan Li
52
13
0
09 Oct 2020
Characterising Bias in Compressed Models
Sara Hooker
Nyalleng Moorosi
Gregory Clark
Samy Bengio
Emily L. Denton
79
185
0
06 Oct 2020
A Survey on Deep Neural Network Compression: Challenges, Overview, and Solutions
Rahul Mishra
Hari Prabhat Gupta
Tanima Dutta
66
93
0
05 Oct 2020
Joint Pruning & Quantization for Extremely Sparse Neural Networks
Po-Hsiang Yu
Sih-Sian Wu
Jan P. Klopp
Liang-Gee Chen
Shao-Yi Chien
MQ
79
16
0
05 Oct 2020
AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via Visual Attention Condensers
A. Wong
M. Famouri
M. Shafiee
66
20
0
30 Sep 2020
NITI: Training Integer Neural Networks Using Integer-only Arithmetic
Maolin Wang
Seyedramin Rasoulinezhad
Philip H. W. Leong
Hayden Kwok-Hay So
MQ
49
41
0
28 Sep 2020
Towards Fully 8-bit Integer Inference for the Transformer Model
Ye Lin
Yanyang Li
Tengbo Liu
Tong Xiao
Tongran Liu
Jingbo Zhu
MQ
78
63
0
17 Sep 2020
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation
Insoo Chung
Byeongwook Kim
Yoonjung Choi
S. Kwon
Yongkweon Jeon
Baeseong Park
Sangha Kim
Dongsoo Lee
MQ
95
27
0
16 Sep 2020