1712.05877
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
Papers citing
"Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"
50 / 1,298 papers shown
Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes
Sanghyun Hong
Michael-Andrei Panaitescu-Liess
Yigitcan Kaya
Tudor Dumitras
MQ
82
13
0
26 Oct 2021
Applications and Techniques for Fast Machine Learning in Science
A. Deiana
Nhan Tran
Joshua C. Agar
Michaela Blott
G. D. Guglielmo
...
Ashish Sharma
S. Summers
Pietro Vischia
J. Vlimant
Olivia Weng
94
72
0
25 Oct 2021
A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays
Leonardo Ravaglia
Manuele Rusci
D. Nadalini
Alessandro Capotondi
Francesco Conti
Luca Benini
BDL
106
68
0
20 Oct 2021
EBJR: Energy-Based Joint Reasoning for Adaptive Inference
Mohammad Akbari
Amin Banitalebi-Dehkordi
Yong Zhang
BDL
MQ
82
7
0
20 Oct 2021
Dynamic Slimmable Denoising Network
Zutao Jiang
Changlin Li
Xiaojun Chang
Jihua Zhu
Yi Yang
AI4CE
31
16
0
17 Oct 2021
Hydra: A System for Large Multi-Model Deep Learning
Kabir Nagrecha
Arun Kumar
MoE
AI4CE
73
5
0
16 Oct 2021
Differentiable Network Pruning for Microcontrollers
Edgar Liberis
Nicholas D. Lane
91
22
0
15 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
100
12
0
15 Oct 2021
PTQ-SL: Exploring the Sub-layerwise Post-training Quantization
Zhihang Yuan
Yiqi Chen
Chenhao Xue
Chenguang Zhang
Qiankun Wang
Guangyu Sun
MQ
28
3
0
15 Oct 2021
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
Weihan Chen
Peisong Wang
Jian Cheng
MQ
93
69
0
13 Oct 2021
Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression
Zhuang Shao
Xiaoliang Chen
Li Du
Lei Chen
Yuan Du
Weihao Zhuang
Huadong Wei
Chenjia Xie
Zhongfeng Wang
40
27
0
12 Oct 2021
LightSeq2: Accelerated Training for Transformer-based Models on GPUs
Xiaohui Wang
Yang Wei
Ying Xiong
Guyue Huang
Xian Qian
Yufei Ding
Mingxuan Wang
Lei Li
VLM
62
33
0
12 Oct 2021
LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time
Elvis Nunez
Maxwell Horton
Anish K. Prabhu
Anurag Ranjan
Ali Farhadi
Mohammad Rastegari
68
4
0
08 Oct 2021
Token Pooling in Vision Transformers
D. Marin
Jen-Hao Rick Chang
Anurag Ranjan
Anish K. Prabhu
Mohammad Rastegari
Oncel Tuzel
ViT
143
71
0
08 Oct 2021
The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation
Orevaoghene Ahia
Julia Kreutzer
Sara Hooker
188
55
0
06 Oct 2021
Shifting Capsule Networks from the Cloud to the Deep Edge
Miguel Costa
Diogo Costa
T. Gomes
Sandro Pinto
86
6
0
06 Oct 2021
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
302
1,300
0
05 Oct 2021
Pre-Quantized Deep Learning Models Codified in ONNX to Enable Hardware/Software Co-Design
U. Hanebutte
Andrew Baldwin
S. Duraković
I. Filipovich
Chien-Chun Chou
Damian Adamowicz
Derek Chickles
David Hawkes
MQ
32
2
0
04 Oct 2021
Progressive Transmission and Inference of Deep Learning Models
Youngsoo Lee
Sangdoo Yun
Yeonghun Kim
Sunghee Choi
44
2
0
03 Oct 2021
SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference
Jude Haris
Perry Gibson
José Cano
Nicolas Bohm Agostini
David Kaeli
91
19
0
01 Oct 2021
Semi-tensor Product-based Tensor Decomposition for Neural Network Compression
Hengling Zhao
Yipeng Liu
Xiaolin Huang
Ce Zhu
79
6
0
30 Sep 2021
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition
Marawan Gamal Abdel Hameed
Marzieh S. Tahaei
A. Mosleh
V. Nia
90
26
0
29 Sep 2021
Smart at what cost? Characterising Mobile Deep Neural Networks in the wild
Mario Almeida
Stefanos Laskaridis
Abhinav Mehrotra
Łukasz Dudziak
Ilias Leontiadis
Nicholas D. Lane
HAI
161
47
0
28 Sep 2021
Consistency Training of Multi-exit Architectures for Sensor Data
Aaqib Saeed
34
1
0
27 Sep 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
83
146
0
27 Sep 2021
Deep Structured Instance Graph for Distilling Object Detectors
Yixin Chen
Pengguang Chen
Shu Liu
Liwei Wang
Jiaya Jia
ObjD
ISeg
61
12
0
27 Sep 2021
Chess AI: Competing Paradigms for Machine Intelligence
Shivanand Maharaj
Nicholas G. Polson
Alex Turk
ELM
88
26
0
23 Sep 2021
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers
Changlin Li
Guangrun Wang
Bing Wang
Xiaodan Liang
Zhihui Li
Xiaojun Chang
96
9
0
21 Sep 2021
Robustness Analysis of Deep Learning Frameworks on Mobile Platforms
Amin Eslami Abyane
Hadi Hemmati
AAML
77
3
0
20 Sep 2021
Towards Energy-Efficient and Secure Edge AI: A Cross-Layer Framework
Mohamed Bennai
Alberto Marchisio
Rachmad Vidya Wicaksana Putra
Muhammad Abdullah Hanif
98
34
0
20 Sep 2021
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
85
4
0
20 Sep 2021
HPTQ: Hardware-Friendly Post Training Quantization
H. Habi
Reuven Peretz
Elad Cohen
Lior Dikstein
Oranit Dror
I. Diamant
Roy H. Jennings
Arnon Netzer
MQ
83
9
0
19 Sep 2021
Comfetch: Federated Learning of Large Networks on Constrained Clients via Sketching
Tahseen Rabbani
Brandon Yushan Feng
Marco Bornstein
Kyle Rui Sang
Yifan Yang
Arjun Rajkumar
A. Varshney
Furong Huang
FedML
119
2
0
17 Sep 2021
Phrase Retrieval Learns Passage Retrieval, Too
Jinhyuk Lee
Alexander Wettig
Danqi Chen
RALM
DML
82
48
0
16 Sep 2021
Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems
Yinghan Long
I. Chakraborty
G. Srinivasan
Kaushik Roy
61
15
0
14 Sep 2021
2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency
Yonggan Fu
Yang Zhao
Qixuan Yu
Chaojian Li
Yingyan Lin
AAML
170
14
0
11 Sep 2021
Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
Prasetya Ajie Utama
N. Moosavi
Victor Sanh
Iryna Gurevych
AAML
128
36
0
09 Sep 2021
Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks
Cheng Gong
Ye Lu
Kunpeng Xie
Zongming Jin
Tao Li
Yanzhi Wang
MQ
64
7
0
08 Sep 2021
BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables
B. Prabakaran
Asima Akhtar
Semeen Rehman
Osman Hasan
Mohamed Bennai
26
10
0
07 Sep 2021
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization
Yi Guo
Huan Yuan
Jianchao Tan
Zhangyang Wang
Sen Yang
Ji Liu
92
46
0
06 Sep 2021
Spiking Neural Networks with Improved Inherent Recurrence Dynamics for Sequential Learning
Wachirawit Ponghiran
Kaushik Roy
115
49
0
04 Sep 2021
On the Accuracy of Analog Neural Network Inference Accelerators
T. Xiao
Ben Feinberg
C. Bennett
V. Prabhakar
Prashant Saxena
V. Agrawal
S. Agarwal
M. Marinella
54
41
0
03 Sep 2021
Diverse Sample Generation: Pushing the Limit of Generative Data-free Quantization
Haotong Qin
Yifu Ding
Xiangguo Zhang
Jiakai Wang
Xianglong Liu
Jiwen Lu
DiffM
MQ
67
57
0
01 Sep 2021
Architecture Aware Latency Constrained Sparse Neural Networks
Tianli Zhao
Qinghao Hu
Xiangyu He
Weixiang Xu
Jiaxing Wang
Cong Leng
Jian Cheng
67
0
0
01 Sep 2021
Pruning with Compensation: Efficient Channel Pruning for Deep Convolutional Neural Networks
Zhouyang Xie
Yan Fu
Sheng-Zhao Tian
Junlin Zhou
Duanbing Chen
3DV
48
0
0
31 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
79
92
0
30 Aug 2021
Compact representations of convolutional neural networks via weight pruning and quantization
Giosuè Cataldo Marinò
A. Petrini
D. Malchiodi
Marco Frasca
MQ
23
4
0
28 Aug 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
Samuel Cahyawijaya
103
12
0
24 Aug 2021
On the Acceleration of Deep Neural Network Inference using Quantized Compressed Sensing
Meshia Cédric Oveneke
MQ
49
0
0
23 Aug 2021
Supervised Compression for Resource-Constrained Edge Computing Systems
Yoshitomo Matsubara
Ruihan Yang
Marco Levorato
Stephan Mandt
118
58
0
21 Aug 2021