Quantized Neural Network Inference with Precision Batching

Maximilian Lam, Zachary Yedidia, Colby R. Banbury, Vijay Janapa Reddi
arXiv:2003.00822 · MQ · 26 February 2020
Papers citing "Quantized Neural Network Inference with Precision Batching"

25 / 25 papers shown
AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference · MQ · 29 Sep 2019
  Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander M. Rush, David Brooks, Gu-Yeon Wei
Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations · 14 Mar 2019
  Tri Dao, Albert Gu, Matthew Eichhorn, Atri Rudra, Christopher Ré
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting · OODD, MQ · 28 Jan 2019
  Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang
Rethinking floating point for deep learning · MQ · 01 Nov 2018
  Jeff Johnson
PACT: Parameterized Clipping Activation for Quantized Neural Networks · MQ · 16 May 2018
  Jungwook Choi, Zhuo Wang, Swagath Venkataramani, P. Chuang, Vijayalakshmi Srinivasan, K. Gopalakrishnan
Word2Bits - Quantized Word Vectors · MQ · 15 Mar 2018
  Maximilian Lam
Alternating Multi-bit Quantization for Recurrent Neural Networks · MQ · 01 Feb 2018
  Chen Xu, Jianqiang Yao, Zhouchen Lin, Wenwu Ou, Yuanbin Cao, Zhirong Wang, H. Zha
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks · MQ · 05 Dec 2017
  Hardik Sharma, Jongse Park, Naveen Suda, Liangzhen Lai, Benson Chau, Joo-Young Kim, Vikas Chandra, H. Esmaeilzadeh
Streamlined Deployment for Quantized Neural Networks · MQ · 12 Sep 2017
  Yaman Umuroglu, Magnus Jahre
On the State of the Art of Evaluation in Neural Language Models · 18 Jul 2017
  Gábor Melis, Chris Dyer, Phil Blunsom
Training Quantized Nets: A Deeper Understanding · MQ · 07 Jun 2017
  Hao Li, Soham De, Zheng Xu, Christoph Studer, H. Samet, Tom Goldstein
Efficient Processing of Deep Neural Networks: A Tutorial and Survey · AAML, 3DV · 27 Mar 2017
  Vivienne Sze, Yu-hsin Chen, Tien-Ju Yang, J. Emer
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights · MQ · 10 Feb 2017
  Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen
Trained Ternary Quantization · MQ · 04 Dec 2016
  Chenzhuo Zhu, Song Han, Huizi Mao, W. Dally
Bit-pragmatic Deep Neural Network Computing · MQ · 20 Oct 2016
  Jorge Albericio, Patrick Judd, A. Delmas, Sayeh Sharify, Andreas Moshovos
Pointer Sentinel Mixture Models · RALM · 26 Sep 2016
  Stephen Merity, Caiming Xiong, James Bradbury, R. Socher
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations · MQ · 22 Sep 2016
  Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
SQuAD: 100,000+ Questions for Machine Comprehension of Text · RALM · 16 Jun 2016
  Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang
EIE: Efficient Inference Engine on Compressed Deep Neural Network · 04 Feb 2016
  Song Han, Xingyu Liu, Huizi Mao, Jing Pu, A. Pedram, M. Horowitz, W. Dally
Learning Natural Language Inference with LSTM · 30 Dec 2015
  Shuohang Wang, Jing Jiang
Neural Networks with Few Multiplications · 11 Oct 2015
  Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding · 3DGS · 01 Oct 2015
  Song Han, Huizi Mao, W. Dally
A large annotated corpus for learning natural language inference · 21 Aug 2015
  Samuel R. Bowman, Gabor Angeli, Christopher Potts, Christopher D. Manning
Learning both Weights and Connections for Efficient Neural Networks · CVBM · 08 Jun 2015
  Song Han, Jeff Pool, J. Tran, W. Dally
Sequence to Sequence Learning with Neural Networks · AIMat · 10 Sep 2014
  Ilya Sutskever, Oriol Vinyals, Quoc V. Le