arXiv:1712.05877
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
Papers citing
"Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"
50 / 1,298 papers shown
Title
Model Compression Techniques in Biometrics Applications: A Survey
Eduarda Caldeira
Pedro C. Neto
Marco Huber
Naser Damer
Ana F. Sequeira
117
11
0
18 Jan 2024
Enabling On-device Continual Learning with Binary Neural Networks
Lorenzo Vorabbi
Davide Maltoni
Guido Borghi
Stefano Santi
MQ
109
6
0
18 Jan 2024
Efficient and Mathematically Robust Operations for Certified Neural Networks Inference
Fabien Geyer
Johannes Freitag
Tobias Schulz
Sascha Uhrig
90
1
0
16 Jan 2024
GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching
Cong Guo
Rui Zhang
Jiale Xu
Jingwen Leng
Zihan Liu
...
Minyi Guo
Hao Wu
Shouren Zhao
Junping Zhao
Ke Zhang
VLM
125
12
0
16 Jan 2024
Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning
Manish Sharma
Jamison Heard
Eli Saber
Panos P. Markopoulos
71
1
0
15 Jan 2024
CLSA-CIM: A Cross-Layer Scheduling Approach for Computing-in-Memory Architectures
Rebecca Pelke
José Cubero-Cascante
Nils Bosbach
Felix Staudigl
Rainer Leupers
Jan Moritz Joseph
53
0
0
15 Jan 2024
Memory-Efficient Fine-Tuning for Quantized Diffusion Model
Hyogon Ryu
Seohyun Lim
Hyunjung Shim
DiffM
MQ
55
6
0
09 Jan 2024
Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression against Heterogeneous Attacks Toward AI Software Deployment
Jie Zhu
Leye Wang
Xiao Han
Anmin Liu
Tao Xie
AAML
115
6
0
02 Jan 2024
A Reliable Knowledge Processing Framework for Combustion Science using Foundation Models
Vansh Sharma
Venkat Raman
52
7
0
31 Dec 2023
Adaptive Depth Networks with Skippable Sub-Paths
Woochul Kang
67
1
0
27 Dec 2023
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization
K. Balaskas
Andreas Karatzas
Christos Sad
K. Siozios
Iraklis Anagnostopoulos
Georgios Zervakis
Jörg Henkel
MQ
72
11
0
23 Dec 2023
Towards Efficient Verification of Quantized Neural Networks
Pei Huang
Haoze Wu
Yuting Yang
Ieva Daukantas
Min Wu
Yedi Zhang
Clark W. Barrett
MQ
78
12
0
20 Dec 2023
Optimizing Convolutional Neural Network Architecture
Luis Balderas
Miguel Lastra
José M. Benítez
CVBM
101
7
0
17 Dec 2023
Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting
Dawei Yang
Ning He
Xing Hu
Zhihang Yuan
Jiangyong Yu
Chen Xu
Zhe Jiang
MQ
90
7
0
17 Dec 2023
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Shaojin Ding
David Qiu
David Rim
Yanzhang He
Oleg Rybakov
...
Tara N. Sainath
Zhonglin Han
Jian Li
Amir Yazdanbakhsh
Shivani Agrawal
MQ
105
12
0
13 Dec 2023
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
146
17
0
13 Dec 2023
FP8-BERT: Post-Training Quantization for Transformer
Jianwei Li
Tianchi Zhang
Ian En-Hsu Yen
Dongkuan Xu
MQ
57
5
0
10 Dec 2023
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
Xuan Shen
Zhaoyang Han
Lei Lu
Zhenglun Kong
Zhengang Li
Ming Lin
Chao Wu
Yanzhi Wang
MQ
122
30
0
09 Dec 2023
SmoothQuant+: Accurate and Efficient 4-bit Post-Training Weight Quantization for LLM
Jiayi Pan
Chengcan Wang
Kaifu Zheng
Yangguang Li
Zhenyu Wang
Bin Feng
MQ
76
7
0
06 Dec 2023
MoEC: Mixture of Experts Implicit Neural Compression
Jianchen Zhao
Cheng-Ching Tseng
Ming Lu
Ruichuan An
Xiaobao Wei
He Sun
Shanghang Zhang
81
3
0
03 Dec 2023
Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
Huancheng Chen
H. Vikalo
FedML
MQ
120
7
0
29 Nov 2023
Mirage: An RNS-Based Photonic Accelerator for DNN Training
Cansu Demirkıran
Guowei Yang
D. Bunandar
Ajay Joshi
71
3
0
29 Nov 2023
LayerCollapse: Adaptive compression of neural networks
Soheil Zibakhsh Shabgahi
Mohammad Soheil Shariff
F. Koushanfar
AI4CE
61
1
0
29 Nov 2023
PIPE: Parallelized Inference Through Post-Training Quantization Ensembling of Residual Expansions
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
MQ
102
0
0
27 Nov 2023
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang
Ruihao Gong
Jing Liu
Tianlong Chen
Xianglong Liu
DiffM
MQ
111
41
0
27 Nov 2023
Fast Inner-Product Algorithms and Architectures for Deep Neural Network Accelerators
Trevor E. Pogue
N. Nicolici
56
3
0
20 Nov 2023
Efficient Neural Networks for Tiny Machine Learning: A Comprehensive Review
M. Lê
Pierre Wolinski
Julyan Arbel
89
10
0
20 Nov 2023
LifeLearner: Hardware-Aware Meta Continual Learning System for Embedded Computing Platforms
Young D. Kwon
Jagmohan Chauhan
Hong Jia
Stylianos I. Venieris
Cecilia Mascolo
89
12
0
19 Nov 2023
Low-Precision Floating-Point for Efficient On-Board Deep Neural Network Processing
Cédric Gernigon
Silviu-Ioan Filip
Olivier Sentieys
Clément Coggiola
Mickael Bruno
MQ
51
8
0
18 Nov 2023
LightBTSeg: A lightweight breast tumor segmentation model using ultrasound images via dual-path joint knowledge distillation
Hongjiang Guo
Shengwen Wang
Hao Dang
Kangle Xiao
Yaru Yang
Wenpei Liu
Tongtong Liu
Yiying Wan
72
2
0
18 Nov 2023
Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis
Simon Niedermayr
Josef Stumpfegger
Rüdiger Westermann
3DGS
105
122
0
17 Nov 2023
Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments
Calvin Tanama
Kunyu Peng
Zdravko Marinov
Rainer Stiefelhagen
Alina Roitberg
65
1
0
10 Nov 2023
Exploiting Neural-Network Statistics for Low-Power DNN Inference
Lennart Bamberg
Ardalan Najafi
Alberto García-Ortiz
29
1
0
09 Nov 2023
RepQ: Generalizing Quantization-Aware Training for Re-Parametrized Architectures
Anastasiia Prutianova
Alexey Zaytsev
Chung-Kuei Lee
Fengyu Sun
Ivan Koryakovskiy
MQ
66
0
0
09 Nov 2023
Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks
Kartik Gupta
Akshay Asthana
MQ
38
8
0
09 Nov 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das
Mingjie Sun
Liqun Ma
Zhiqiang Shen
VLM
79
18
0
08 Nov 2023
A Lightweight Architecture for Real-Time Neuronal-Spike Classification
Muhammad Ali Siddiqi
David Vrijenhoek
Lennart P L Landsmeer
Job van der Kleij
A. Gebregiorgis
V. Romano
R. Bishnoi
Said Hamdioui
Christos Strydis
36
1
0
08 Nov 2023
AFPQ: Asymmetric Floating Point Quantization for LLMs
Yijia Zhang
Sicheng Zhang
Shijie Cao
Dayou Du
Jianyu Wei
Ting Cao
Ningyi Xu
MQ
58
5
0
03 Nov 2023
Effective Quantization for Diffusion Models on CPUs
Hanwen Chang
Haihao Shen
Yiyang Cai
Xinyu Ye
Zhenzhong Xu
Wenhua Cheng
Kaokao Lv
Weiwei Zhang
Yintong Lu
Heng Guo
MQ
77
7
0
02 Nov 2023
Fully Quantized Always-on Face Detector Considering Mobile Image Sensors
Haechang Lee
Wongi Jeong
Dongil Ryu
Hyunwoo Je
Albert No
Kijeong Kim
Se Young Chun
CVBM
59
0
0
02 Nov 2023
Efficient LLM Inference on CPUs
Haihao Shen
Hanwen Chang
Bo Dong
Yu Luo
Hengyu Meng
MQ
68
19
0
01 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
100
64
0
01 Nov 2023
Compression with Exact Error Distribution for Federated Learning
Mahmoud Hegazy
Rémi Leluc
Cheuk Ting Li
Hadrien Hendrikx
FedML
63
11
0
31 Oct 2023
FlexTrain: A Dynamic Training Framework for Heterogeneous Devices Environments
Mert Unsal
Ali Maatouk
Antonio De Domenico
Nicola Piovesan
Fadhel Ayed
30
0
0
31 Oct 2023
Efficient IoT Inference via Context-Awareness
Mohammad Mehdi Rastikerdar
Jin Huang
Shiwei Fang
Hui Guan
Deepak Ganesan
103
0
0
29 Oct 2023
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Yilong Zhao
Chien-Yu Lin
Kan Zhu
Zihao Ye
Lequn Chen
Wenlei Bao
Luis Ceze
Arvind Krishnamurthy
Tianqi Chen
Baris Kasikci
MQ
147
150
0
29 Oct 2023
Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation
Pourya Shamsolmoali
Jocelyn Chanussot
Huiyu Zhou
Yue Lu
119
5
0
28 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Zichang Liu
Jue Wang
Tri Dao
Dinesh Manocha
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
123
221
0
26 Oct 2023
VMAF Re-implementation on PyTorch: Some Experimental Results
Kirill Aistov
Maxim Koroteev
146
2
0
24 Oct 2023
Effortless Cross-Platform Video Codec: A Codebook-Based Method
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
74
1
0
16 Oct 2023