ResearchTrend.AI
arXiv:1712.05877
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown
Title
Towards Fast Single-Trial Online ERP based Brain-Computer Interface using dry EEG electrodes and neural networks: a pilot study
Okba Bekhelifi
N. Berrached
29
3
0
04 Nov 2022
DAD vision: opto-electronic co-designed computer vision with division adjoint method
Zihan Zang
Hao Wang
Yunpeng Xu
29
0
0
04 Nov 2022
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Zhekai Zhang
Ji Lin
Chenlin Meng
Stefano Ermon
Song Han
Jun-Yan Zhu
DiffM
140
49
0
03 Nov 2022
Self Similarity Matrix based CNN Filter Pruning
S. Rakshith
Jayesh Rajkumar Vachhani
Sourabh Vasant Gothe
Rishabh Khurana
47
0
0
03 Nov 2022
Edge Impulse: An MLOps Platform for Tiny Machine Learning
Shawn Hymel
Colby R. Banbury
Daniel Situnayake
A. Elium
Carl Ward
...
Louis Moreau
Dmitry Maslov
A. Beavis
Jan Jongboom
Vijay Janapa Reddi
VLM LRM
116
101
0
02 Nov 2022
QuaLA-MiniLM: a Quantized Length Adaptive MiniLM
Shira Guskin
Moshe Wasserblat
Chang Wang
Haihao Shen
MQ
76
2
0
31 Oct 2022
Block-Wise Dynamic-Precision Neural Network Training Acceleration via Online Quantization Sensitivity Analytics
Ruoyang Liu
Chenhan Wei
Yixiong Yang
Wenxun Wang
Huazhong Yang
Yongpan Liu
MQ
63
3
0
31 Oct 2022
MinUn: Accurate ML Inference on Microcontrollers
Shikhar Jaiswal
R. Goli
Aayan Kumar
Vivek Seshadri
Rahul Sharma
88
3
0
29 Oct 2022
Fast DistilBERT on CPUs
Haihao Shen
Ofir Zafrir
Bo Dong
Hengyu Meng
Xinyu Ye
Zhe Wang
Yi Ding
Hanwen Chang
Guy Boudoukh
Moshe Wasserblat
VLM
60
2
0
27 Oct 2022
Zero-Shot Learning of a Conditional Generative Adversarial Network for Data-Free Network Quantization
Yoojin Choi
Mostafa El-Khamy
Jungwon Lee
GAN
56
1
0
26 Oct 2022
Weight Fixing Networks
Christopher Subia-Waud
S. Dasmahapatra
MQ
75
2
0
24 Oct 2022
TPU-MLIR: A Compiler For TPU Using MLIR
Pengchao Hu
Man Lu
Lei Wang
Guoyue Jiang
31
5
0
23 Oct 2022
Real-Time Multi-Modal Semantic Fusion on Unmanned Aerial Vehicles with Label Propagation for Cross-Domain Adaptation
S. Bultmann
Jan Quenzel
Sven Behnke
70
18
0
18 Oct 2022
Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning
Dongze Lian
Daquan Zhou
Jiashi Feng
Xinchao Wang
122
264
0
17 Oct 2022
FIT: A Metric for Model Sensitivity
Ben Zandonati
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
81
8
0
16 Oct 2022
FAQS: Communication-efficient Federate DNN Architecture and Quantization Co-Search for personalized Hardware-aware Preferences
Hongjiang Chen
Yang Wang
Leibo Liu
Shaojun Wei
Shouyi Yin
FedML MQ
39
0
0
16 Oct 2022
Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion
Lev Grossman
Brian Plancher
MQ
69
4
0
14 Oct 2022
ENTS: An Edge-native Task Scheduling System for Collaborative Edge Computing
Mingjin Zhang
Jiannong Cao
Lei Yang
Li Zhang
Yuvraj Sahni
Shan Jiang
34
21
0
14 Oct 2022
Accelerating RNN-based Speech Enhancement on a Multi-Core MCU with Mixed FP16-INT8 Post-Training Quantization
Manuele Rusci
Marco Fariselli
Martin Croome
Francesco Paci
Eric Flamand
MQ
62
12
0
14 Oct 2022
SQuAT: Sharpness- and Quantization-Aware Training for BERT
Zheng Wang
Juncheng Billy Li
Shuhui Qu
Florian Metze
Emma Strubell
MQ
42
7
0
13 Oct 2022
SeKron: A Decomposition Method Supporting Many Factorization Structures
Marawan Gamal Abdel Hameed
A. Mosleh
Marzieh S. Tahaei
V. Nia
59
1
0
12 Oct 2022
TriangleNet: Edge Prior Augmented Network for Semantic Segmentation through Cross-Task Consistency
Dan Zhang
Rui Zheng
Luosang Gadeng
Pei Yang
87
1
0
11 Oct 2022
Training Spiking Neural Networks with Local Tandem Learning
Qu Yang
Jibin Wu
Malu Zhang
Yansong Chua
Xinchao Wang
Haizhou Li
106
41
0
10 Oct 2022
Low Error-Rate Approximate Multiplier Design for DNNs with Hardware-Driven Co-Optimization
Yao Lu
Jide Zhang
Su Zheng
Zhen Li
Lingli Wang
MQ
32
2
0
08 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
119
31
0
08 Oct 2022
A Closer Look at Hardware-Friendly Weight Quantization
Sungmin Bae
Piotr Zielinski
S. Chatterjee
MQ
62
0
0
07 Oct 2022
Inference Latency Prediction at the Edge
Zhuojin Li
Marco Paolieri
L. Golubchik
64
3
0
06 Oct 2022
Limitations of neural network training due to numerical instability of backpropagation
Clemens Karner
V. Kazeev
P. Petersen
78
3
0
03 Oct 2022
Basic Binary Convolution Unit for Binarized Image Restoration Network
Bin Xia
Yulun Zhang
Yitong Wang
Yapeng Tian
Wenming Yang
Radu Timofte
Luc Van Gool
MQ
80
23
0
02 Oct 2022
Convolutional Neural Networks Quantization with Attention
Binyi Wu
Bernd Waschneck
Christian Mayr
MQ
93
2
0
30 Sep 2022
Verifiable and Energy Efficient Medical Image Analysis with Quantised Self-attentive Deep Neural Networks
Rakshith Sathish
S. Khare
Debdoot Sheet
53
4
0
30 Sep 2022
Tuning of Mixture-of-Experts Mixed-Precision Neural Networks
Fabian Tschopp
FedML MoE
13
0
0
29 Sep 2022
FoVolNet: Fast Volume Rendering using Foveated Deep Neural Networks
David Bauer
Qi Wu
Kwan-Liu Ma
3DH
89
19
0
20 Sep 2022
Understanding Real-world Threats to Deep Learning Models in Android Apps
Zizhuang Deng
Kai Chen
Guozhu Meng
Xiaodong Zhang
Ke Xu
Yao Cheng
AAML
70
29
0
20 Sep 2022
SAMP: A Model Inference Toolkit of Post-Training Quantization for Text Processing via Self-Adaptive Mixed-Precision
Rong Tian
Zijing Zhao
Weijie Liu
Haoyan Liu
Weiquan Mao
Zhe Zhao
Kimmo Yan
MQ
52
5
0
19 Sep 2022
Analysis of Quantization on MLP-based Vision Models
Lingran Zhao
Zhen Dong
Kurt Keutzer
MQ
64
7
0
14 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li
Mengjuan Chen
Junrui Xiao
Qingyi Gu
ViT MQ
127
35
0
13 Sep 2022
Hardware Accelerator and Neural Network Co-Optimization for Ultra-Low-Power Audio Processing Devices
Christoph Gerum
Adrian Frischknecht
T. Hald
Paul Palomero Bernardo
Konstantin Lubeck
Oliver Bringmann
83
10
0
08 Sep 2022
Generative Adversarial Super-Resolution at the Edge with Knowledge Distillation
Simone Angarano
Francesco Salvetti
Mauro Martini
Marcello Chiaberge
GAN
110
22
0
07 Sep 2022
Ultra-low-power Range Error Mitigation for Ultra-wideband Precise Localization
Simone Angarano
Francesco Salvetti
Vittorio Mazzia
Giovanni Fantin
Dario Gandini
Marcello Chiaberge
76
3
0
07 Sep 2022
Side-channel attack analysis on in-memory computing architectures
Ziyu Wang
Fanruo Meng
Yongmo Park
Jason K. Eshraghian
Wei D. Lu
113
22
0
06 Sep 2022
PulseDL-II: A System-on-Chip Neural Network Accelerator for Timing and Energy Extraction of Nuclear Detector Signals
P. Ai
Z. Deng
Yi Wang
H. Gong
Xinchi Ran
Z. Lang
59
3
0
02 Sep 2022
Human Activity Recognition on Microcontrollers with Quantized and Adaptive Deep Neural Networks
Francesco Daghero
Luca Bompani
Chen Xie
Marco Castellano
Luca Gandolfi
A. Calimera
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
BDL HAI
63
24
0
02 Sep 2022
On Quantizing Implicit Neural Representations
Cameron Gordon
Shin-Fang Chng
L. MacDonald
Simon Lucey
MQ
88
21
0
01 Sep 2022
QuantNAS for super resolution: searching for efficient quantization-friendly architectures against quantization noise
Egor Shvetsov
Dmitry Osin
Alexey Zaytsev
Ivan Koryakovskiy
Valentin Buchnev
I. Trofimov
Evgeny Burnaev
MQ
99
2
0
31 Aug 2022
XCAT -- Lightweight Quantized Single Image Super-Resolution using Heterogeneous Group Convolutions and Cross Concatenation
Mustafa Ayazoglu
Bahri Batuhan Bilecen
SupR
106
4
0
31 Aug 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
88
60
0
30 Aug 2022
Reducing Computational Complexity of Neural Networks in Optical Channel Equalization: From Concepts to Implementation
Pedro J. Freire
A. Napoli
D. A. Ron
B. Spinnler
M. Anderson
W. Schairer
T. Bex
N. Costa
S. Turitsyn
Jaroslaw E. Prilepsky
78
29
0
26 Aug 2022
GHN-Q: Parameter Prediction for Unseen Quantized Convolutional Architectures via Graph Hypernetworks
S. Yun
Alexander Wong
GNN MQ
33
1
0
26 Aug 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li
Cong Guo
Zhanda Zhu
Yangjie Zhou
Yuxian Qiu
Xiaotian Gao
Jingwen Leng
Minyi Guo
MQ
96
4
0
25 Aug 2022