Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
Jing Liu
Ruihao Gong
Xiuying Wei
Zhiwei Dong
Jianfei Cai
Bohan Zhuang
MQ
94
53
0
12 Oct 2023
AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE
Wei Ao
Vishnu Boddeti
AAML
82
21
0
12 Oct 2023
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources
Zhikai Li
Xiaoxuan Liu
Banghua Zhu
Zhen Dong
Qingyi Gu
Kurt Keutzer
MQ
104
7
0
11 Oct 2023
Segmented Harmonic Loss: Handling Class-Imbalanced Multi-Label Clinical Data for Medical Coding with Large Language Models
Surjya Ray
Pratik Mehta
Hongen Zhang
Ada Chaman
Jian Wang
Chung-Jen Ho
Michael Chiou
T. Suleman
31
1
0
06 Oct 2023
Quantized Transformer Language Model Implementations on Edge Devices
Mohammad Wali Ur Rahman
Murad Mehrab Abrar
Hunter Gibbons Copening
Salim Hariri
Sicong Shao
Pratik Satam
Soheil Salehi
MQ
68
11
0
06 Oct 2023
EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models
Yefei He
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
DiffM, MQ
143
51
0
05 Oct 2023
A Study of Quantisation-aware Training on Time Series Transformer Models for Resource-constrained FPGAs
Tianheng Ling
Chao Qian
Lukas Einhaus
Gregor Schiele
31
1
0
04 Oct 2023
NOLA: Compressing LoRA using Linear Combination of Random Basis
Soroush Abbasi Koohpayegani
K. Navaneet
Parsa Nooralinejad
Soheil Kolouri
Hamed Pirsiavash
141
16
0
04 Oct 2023
The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers
Rickard Brannvall
46
0
0
03 Oct 2023
Subtractor-Based CNN Inference Accelerator
Victor Gao
Issam Hammad
K. El-Sankary
Jason Gu
23
0
0
02 Oct 2023
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things
Samiul Alam
Tuo Zhang
Tiantian Feng
Hui Shen
Zhichao Cao
...
JeongGil Ko
Kiran Somasundaram
Shrikanth S. Narayanan
Salman Avestimehr
Mi Zhang
121
11
0
29 Sep 2023
MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search
Yichen Xie
Wei Le
MQ
53
4
0
29 Sep 2023
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors
Chengming Zhang
Baixi Sun
Xiaodong Yu
Zhen Xie
Weijian Zheng
K. Iskra
Pete Beckman
Dingwen Tao
50
5
0
29 Sep 2023
Highly Efficient SNNs for High-speed Object Detection
Nemin Qiu
Zhiguo Li
Yuan Li
Chuang Zhu
99
0
0
27 Sep 2023
Efficient Post-training Quantization with FP8 Formats
Haihao Shen
Naveen Mellempudi
Xin He
Q. Gao
Chang‐Bao Wang
Mengni Wang
MQ
79
23
0
26 Sep 2023
GHN-QAT: Training Graph Hypernetworks to Predict Quantization-Robust Parameters of Unseen Limited Precision Neural Networks
S. Yun
Alexander Wong
MQ
64
0
0
24 Sep 2023
Causal-DFQ: Causality Guided Data-free Network Quantization
Yuzhang Shang
Bingxin Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ, CML
92
8
0
24 Sep 2023
Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
Christopher Subia-Waud
S. Dasmahapatra
UQCV, MQ
63
1
0
24 Sep 2023
Early-Exit with Class Exclusion for Efficient Inference of Neural Networks
Jing Wang
Bing Li
Grace Li Zhang
43
4
0
23 Sep 2023
A Machine Learning-oriented Survey on Tiny Machine Learning
Luigi Capogrosso
Federico Cunico
D. Cheng
Franco Fummi
Marco Cristani
SyDa, MU
106
45
0
21 Sep 2023
Towards Real-Time Neural Video Codec for Cross-Platform Application Using Calibration Information
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
70
7
0
20 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon
H. Habi
Arnon Netzer
MQ
73
0
0
20 Sep 2023
SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization
Jinjie Zhang
Rayan Saab
42
0
0
20 Sep 2023
Logic Design of Neural Networks for High-Throughput and Low-Power Applications
Kangwei Xu
Grace Li Zhang
Ulf Schlichtmann
Bing Li
57
3
0
19 Sep 2023
Scaling Laws for Sparsely-Connected Foundation Models
Elias Frantar
C. Riquelme
N. Houlsby
Dan Alistarh
Utku Evci
116
38
0
15 Sep 2023
Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity
Matteo Grimaldi
Darshan C. Ganji
Ivan Lazarevich
Sudhakar Sah
61
10
0
12 Sep 2023
Real-Time Semantic Segmentation: A Brief Survey & Comparative Study in Remote Sensing
Clifford Broni-Bediako
Junshi Xia
Naoto Yokoya
89
10
0
12 Sep 2023
Quantized Non-Volatile Nanomagnetic Synapse based Autoencoder for Efficient Unsupervised Network Anomaly Detection
Muhammad Sabbir Alam
W. A. Misba
J. Atulasimha
19
1
0
12 Sep 2023
Soft Quantization using Entropic Regularization
Rajmadan Lakshmanan
Alois Pichler
MQ
37
5
0
08 Sep 2023
CPU frequency scheduling of real-time applications on embedded devices with temporal encoding-based deep reinforcement learning
Ti Zhou
Man Lin
59
4
0
07 Sep 2023
Memory Efficient Optimizers with 4-bit States
Bingrui Li
Jianfei Chen
Jun Zhu
MQ
87
40
0
04 Sep 2023
martFL: Enabling Utility-Driven Data Marketplace with a Robust and Verifiable Federated Learning Architecture
Qi Li
Zhuotao Liu
Qi Li
Ke Xu
95
17
0
03 Sep 2023
On-Device Learning with Binary Neural Networks
Lorenzo Vorabbi
Davide Maltoni
Stefano Santi
MQ
95
4
0
29 Aug 2023
Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization
Shuzhang Zhong
Meng Li
Yun Liang
Runsheng Wang
Ru Huang
GNN
36
3
0
26 Aug 2023
Homological Convolutional Neural Networks
Antonio Briola
Yuanrong Wang
Silvia Bartolucci
T. Aste
LMTD
80
7
0
26 Aug 2023
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
MQ
111
9
0
25 Aug 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
99
17
0
21 Aug 2023
QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
Yifan Zhang
Zhen Dong
Huanrui Yang
Ming Lu
Cheng-Ching Tseng
Yuan Du
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
75
9
0
21 Aug 2023
ResQ: Residual Quantization for Video Perception
Davide Abati
H. Yahia
Markus Nagel
A. Habibian
MQ
40
2
0
18 Aug 2023
How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers?
G. Holste
Ziyu Jiang
Ajay Jaiswal
Maria Hanna
Shlomo Minkowitz
...
Ying Ding
Ronald M. Summers
George Shih
Yifan Peng
Zhangyang Wang
113
1
0
17 Aug 2023
Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning
Shipeng Bai
Jun Chen
Xintian Shen
Yixuan Qian
Yong Liu
MQ
97
15
0
14 Aug 2023
Efficient Neural PDE-Solvers using Quantization Aware Training
W.V.S.O. van den Dool
Tijmen Blankevoort
Max Welling
Yuki M. Asano
MQ
60
3
0
14 Aug 2023
Exploring Frequency-Inspired Optimization in Transformer for Efficient Single Image Super-Resolution
Ao Li
Le Zhang
Yun-Hai Liu
Ce Zhu
74
11
0
09 Aug 2023
LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning
Longteng Zhang
Lin Zhang
Shaoshuai Shi
Xiaowen Chu
Yue Liu
AI4CE
74
108
0
07 Aug 2023
Tango: rethinking quantization for graph neural network training on GPUs
Shiyang Chen
Da Zheng
Caiwen Ding
Chengying Huan
Yuede Ji
Hang Liu
GNN, MQ
80
6
0
02 Aug 2023
MRQ:Support Multiple Quantization Schemes through Model Re-Quantization
Manasa Manohara
Sankalp Dayal
Tarqi Afzal
Rahul Bakshi
Kahkuen Fu
MQ
54
0
0
01 Aug 2023
Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation
Stylianos I. Venieris
Javier Fernandez-Marques
Nicholas D. Lane
MQ
65
3
0
25 Jul 2023
Adaptive ResNet Architecture for Distributed Inference in Resource-Constrained IoT Systems
Fazeela Mazhar Khan
Emna Baccour
A. Erbad
Mounir Hamdi
41
2
0
21 Jul 2023
EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
Peijie Dong
Lujun Li
Zimian Wei
Xin-Yi Niu
Zhiliang Tian
H. Pan
MQ
81
31
0
20 Jul 2023
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Vasileios Leon
Muhammad Abdullah Hanif
Giorgos Armeniakos
Xun Jiao
Mohamed Bennai
K. Pekmestzi
Dimitrios Soudris
104
3
0
20 Jul 2023