ResearchTrend.AI

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,258 papers shown
Compression with Exact Error Distribution for Federated Learning
Mahmoud Hegazy
Rémi Leluc
Cheuk Ting Li
Aymeric Dieuleveut
FedML
18
9
0
31 Oct 2023
FlexTrain: A Dynamic Training Framework for Heterogeneous Devices Environments
Mert Unsal
Ali Maatouk
Antonio De Domenico
Nicola Piovesan
Fadhel Ayed
16
0
0
31 Oct 2023
Efficient IoT Inference via Context-Awareness
Mohammad Mehdi Rastikerdar
Jin Huang
Shiwei Fang
Hui Guan
Deepak Ganesan
43
0
0
29 Oct 2023
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Yilong Zhao
Chien-Yu Lin
Kan Zhu
Zihao Ye
Lequn Chen
Wenlei Bao
Luis Ceze
Arvind Krishnamurthy
Tianqi Chen
Baris Kasikci
MQ
28
133
0
29 Oct 2023
Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation
Pourya Shamsolmoali
Jocelyn Chanussot
Huiyu Zhou
Yue Lu
42
5
0
28 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Zichang Liu
Jue Wang
Tri Dao
Dinesh Manocha
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
25
192
0
26 Oct 2023
VMAF Re-implementation on PyTorch: Some Experimental Results
Kirill Aistov
Maxim Koroteev
41
1
0
24 Oct 2023
Effortless Cross-Platform Video Codec: A Codebook-Based Method
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
36
1
0
16 Oct 2023
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
Jing Liu
Ruihao Gong
Xiuying Wei
Zhiwei Dong
Jianfei Cai
Bohan Zhuang
MQ
35
51
0
12 Oct 2023
AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE
Wei Ao
Vishnu Boddeti
AAML
33
18
0
12 Oct 2023
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources
Zhikai Li
Xiaoxuan Liu
Banghua Zhu
Zhen Dong
Qingyi Gu
Kurt Keutzer
MQ
32
7
0
11 Oct 2023
Segmented Harmonic Loss: Handling Class-Imbalanced Multi-Label Clinical Data for Medical Coding with Large Language Models
Surjya Ray
Pratik Mehta
Hongen Zhang
Ada Chaman
Jian Wang
Chung-Jen Ho
Michael Chiou
T. Suleman
19
1
0
06 Oct 2023
Quantized Transformer Language Model Implementations on Edge Devices
Mohammad Wali Ur Rahman
Murad Mehrab Abrar
Hunter Gibbons Copening
Salim Hariri
Sicong Shao
Pratik Satam
Soheil Salehi
MQ
19
8
0
06 Oct 2023
EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models
Yefei He
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
DiffM
MQ
24
47
0
05 Oct 2023
A Study of Quantisation-aware Training on Time Series Transformer Models for Resource-constrained FPGAs
Tianheng Ling
Chao Qian
Lukas Einhaus
Gregor Schiele
13
1
0
04 Oct 2023
NOLA: Compressing LoRA using Linear Combination of Random Basis
Soroush Abbasi Koohpayegani
K. Navaneet
Parsa Nooralinejad
Soheil Kolouri
Hamed Pirsiavash
40
12
0
04 Oct 2023
The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers
Rickard Brannvall
24
0
0
03 Oct 2023
Subtractor-Based CNN Inference Accelerator
Victor Gao
Issam Hammad
K. El-Sankary
Jason Gu
10
0
0
02 Oct 2023
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things
Samiul Alam
Tuo Zhang
Tiantian Feng
Hui Shen
Zhichao Cao
...
JeongGil Ko
Kiran Somasundaram
Shrikanth S. Narayanan
Salman Avestimehr
Mi Zhang
38
11
0
29 Sep 2023
MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search
Yichen Xie
Wei Le
MQ
24
4
0
29 Sep 2023
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors
Chengming Zhang
Baixi Sun
Xiaodong Yu
Zhen Xie
Weijian Zheng
K. Iskra
Pete Beckman
Dingwen Tao
25
4
0
29 Sep 2023
Highly Efficient SNNs for High-speed Object Detection
Nemin Qiu
Zhiguo Li
Yuan Li
Chuang Zhu
24
0
0
27 Sep 2023
Efficient Post-training Quantization with FP8 Formats
Haihao Shen
Naveen Mellempudi
Xin He
Q. Gao
Chang‐Bao Wang
Mengni Wang
MQ
23
19
0
26 Sep 2023
GHN-QAT: Training Graph Hypernetworks to Predict Quantization-Robust Parameters of Unseen Limited Precision Neural Networks
S. Yun
Alexander Wong
MQ
12
0
0
24 Sep 2023
Causal-DFQ: Causality Guided Data-free Network Quantization
Yuzhang Shang
Bingxin Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ
CML
26
8
0
24 Sep 2023
Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
Christopher Subia-Waud
S. Dasmahapatra
UQCV
MQ
21
0
0
24 Sep 2023
Early-Exit with Class Exclusion for Efficient Inference of Neural Networks
Jing Wang
Bing Li
Grace Li Zhang
13
4
0
23 Sep 2023
A Machine Learning-oriented Survey on Tiny Machine Learning
Luigi Capogrosso
Federico Cunico
D. Cheng
Franco Fummi
Marco Cristani
SyDa
MU
32
34
0
21 Sep 2023
Towards Real-Time Neural Video Codec for Cross-Platform Application Using Calibration Information
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
40
7
0
20 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon
H. Habi
Arnon Netzer
MQ
41
1
0
20 Sep 2023
SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization
Jinjie Zhang
Rayan Saab
22
0
0
20 Sep 2023
Logic Design of Neural Networks for High-Throughput and Low-Power Applications
Kangwei Xu
Grace Li Zhang
Ulf Schlichtmann
Bing Li
30
3
0
19 Sep 2023
Scaling Laws for Sparsely-Connected Foundation Models
Elias Frantar
C. Riquelme
N. Houlsby
Dan Alistarh
Utku Evci
35
36
0
15 Sep 2023
Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity
Matteo Grimaldi
Darshan C. Ganji
Ivan Lazarevich
Sudhakar Sah
14
10
0
12 Sep 2023
Real-Time Semantic Segmentation: A Brief Survey & Comparative Study in Remote Sensing
Clifford Broni-Bediako
Junshi Xia
Naoto Yokoya
51
10
0
12 Sep 2023
Quantized Non-Volatile Nanomagnetic Synapse based Autoencoder for Efficient Unsupervised Network Anomaly Detection
Muhammad Sabbir Alam
W. A. Misba
J. Atulasimha
12
1
0
12 Sep 2023
Soft Quantization using Entropic Regularization
Rajmadan Lakshmanan
Alois Pichler
MQ
13
5
0
08 Sep 2023
CPU frequency scheduling of real-time applications on embedded devices with temporal encoding-based deep reinforcement learning
Ti Zhou
Man Lin
11
4
0
07 Sep 2023
Memory Efficient Optimizers with 4-bit States
Bingrui Li
Jianfei Chen
Jun Zhu
MQ
30
34
0
04 Sep 2023
martFL: Enabling Utility-Driven Data Marketplace with a Robust and Verifiable Federated Learning Architecture
Qi Li
Zhuotao Liu
Qi Li
Ke Xu
22
12
0
03 Sep 2023
On-Device Learning with Binary Neural Networks
Lorenzo Vorabbi
Davide Maltoni
Stefano Santi
MQ
39
4
0
29 Aug 2023
Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization
Shuzhang Zhong
Meng Li
Yun Liang
Runsheng Wang
Ru Huang
GNN
11
3
0
26 Aug 2023
Homological Convolutional Neural Networks
Antonio Briola
Yuanrong Wang
Silvia Bartolucci
T. Aste
LMTD
33
6
0
26 Aug 2023
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
MQ
24
9
0
25 Aug 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
41
16
0
21 Aug 2023
QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
Yifan Zhang
Zhen Dong
Huanrui Yang
Ming Lu
Cheng-Ching Tseng
Yuan Du
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
34
9
0
21 Aug 2023
ResQ: Residual Quantization for Video Perception
Davide Abati
H. Yahia
Markus Nagel
A. Habibian
MQ
23
2
0
18 Aug 2023
How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers?
G. Holste
Ziyu Jiang
Ajay Jaiswal
Maria Hanna
Shlomo Minkowitz
...
Ying Ding
Ronald M. Summers
George Shih
Yifan Peng
Zhangyang Wang
26
1
0
17 Aug 2023
Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning
Shipeng Bai
Jun Chen
Xintian Shen
Yixuan Qian
Yong Liu
MQ
24
12
0
14 Aug 2023
Efficient Neural PDE-Solvers using Quantization Aware Training
W.V.S.O. van den Dool
Tijmen Blankevoort
Max Welling
Yuki M. Asano
MQ
33
3
0
14 Aug 2023