ResearchTrend.AI
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown
Clinical Insights: A Comprehensive Review of Language Models in Medicine
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
127
5
0
08 Jan 2025
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning
Zhen Li
Yupeng Su
Runming Yang
C. Xie
Zehua Wang
Zhongwei Xie
Ngai Wong
Hongxia Yang
MQ, LRM
181
4
0
06 Jan 2025
Pruning-based Data Selection and Network Fusion for Efficient Deep Learning
Humaira Kousar
Hasnain Irshad Bhatti
Jaekyun Moon
96
1
0
03 Jan 2025
PTQ4VM: Post-Training Quantization for Visual Mamba
Jun-gyu Jin
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ, Mamba
132
2
0
29 Dec 2024
Adopting Trustworthy AI for Sleep Disorder Prediction: Deep Time Series Analysis with Temporal Attention Mechanism and Counterfactual Explanations
Pegah Ahadian
Wei Xu
Sherry Wang
Qiang Guan
AI4TS
43
1
0
25 Dec 2024
CBNN: 3-Party Secure Framework for Customized Binary Neural Networks Inference
Benchang Dong
Zhili Chen
Xin Chen
Shiwen Wei
Jie Fu
Huifa Li
110
1
0
21 Dec 2024
Improving Quantization-aware Training of Low-Precision Network via Block Replacement on Full-Precision Counterpart
Chengting Yu
Shu Yang
Fengzhao Zhang
Hanzhi Ma
Aili Wang
Er-ping Li
MQ
113
4
0
20 Dec 2024
MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
Weilun Feng
Haotong Qin
Chuanguang Yang
Zhulin An
Libo Huang
Boyu Diao
Fei Wang
Renshuai Tao
Yongjun Xu
Michele Magno
DiffM, MQ
142
5
0
16 Dec 2024
Efficient Quantization-Aware Training on Segment Anything Model in Medical Images and Its Deployment
Haisheng Lu
Yujie Fu
Fan Zhang
Le Zhang
MedIm, MQ
127
0
0
15 Dec 2024
DQA: An Efficient Method for Deep Quantization of Deep Neural Network Activations
Wenhao Hu
Paul Henderson
José Cano
MQ
85
0
0
12 Dec 2024
Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on Edge Devices
Thanaphon Suwannaphong
Ferdian Jovan
I. Craddock
Ryan McConville
Mamba
118
3
0
12 Dec 2024
MOFHEI: Model Optimizing Framework for Fast and Efficient Homomorphically Encrypted Neural Network Inference
Parsa Ghazvinian
Robert Podschwadt
Prajwal Panzade
Mohammad H. Rafiei
Daniel Takabi
109
0
0
10 Dec 2024
DEX: Data Channel Extension for Efficient CNN Inference on Tiny AI Accelerators
Taesik Gong
F. Kawsar
Chulhong Min
119
3
0
09 Dec 2024
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
Runsheng Bai
Qiang Liu
B. Liu
MQ
135
2
0
05 Dec 2024
Designing DNNs for a trade-off between robustness and processing performance in embedded devices
Jon Gutiérrez-Zaballa
Koldo Basterretxea
Javier Echanobe
AAML
155
2
0
04 Dec 2024
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
Amitash Nanda
Sree Bhargavi Balija
D. Sahoo
MQ
106
0
0
03 Dec 2024
Behavior Backdoor for Deep Learning Models
Jinqiao Wang
Pengfei Zhang
R. Tao
Jian Yang
Hao Liu
Xianglong Liu
Y. X. Wei
Yao Zhao
AAML
124
0
0
02 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
142
3
0
02 Dec 2024
On-chip Hyperspectral Image Segmentation with Fully Convolutional Networks for Scene Understanding in Autonomous Driving
Jon Gutiérrez-Zaballa
Koldo Basterretxea
Javier Echanobe
M. Victoria Martínez
Unai Martínez-Corral
Óscar Mata-Carballeira
Inés del Campo
140
19
0
28 Nov 2024
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang
Tao Ge
Thomas Hartvigsen
Zhisong Zhang
Haitao Mi
Dong Yu
MQ
151
5
0
26 Nov 2024
Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving
Jon Gutiérrez-Zaballa
Koldo Basterretxea
Javier Echanobe
Óscar Mata-Carballeira
M. Victoria Martínez
129
3
0
26 Nov 2024
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization
Rui Xie
Tianchen Zhao
Zhihang Yuan
Rui Wan
Wenxi Gao
Zhenhua Zhu
Xuefei Ning
Yu Wang
VGen, MQ
95
4
0
26 Nov 2024
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution
Libo Zhu
Jiajian Li
Haotong Qin
Wenbo Li
Yulun Zhang
Yong Guo
Xiaokang Yang
DiffM, MQ
115
3
0
26 Nov 2024
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance
Jiayi Chen
Chen Wu
Shanghang Zhang
Nan Li
Lefei Zhang
Qi Zhang
115
0
0
23 Nov 2024
Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
G. Nam
Juho Lee
135
0
0
22 Nov 2024
Exploring the Robustness and Transferability of Patch-Based Adversarial Attacks in Quantized Neural Networks
Amira Guesmi
B. Ouni
Mohamed Bennai
AAML
153
0
0
22 Nov 2024
Quantized symbolic time series approximation
Erin Carson
Xinye Chen
Cheng Kang
AI4TS
169
0
0
20 Nov 2024
Understanding Student Sentiment on Mental Health Support in Colleges Using Large Language Models
Palak Sood
Chengyang He
Divyanshu Gupta
Yue Ning
Ping Wang
AI4MH
105
1
0
18 Nov 2024
Towards Accurate and Efficient Sub-8-Bit Integer Training
Wenjin Guo
Donglai Liu
Weiying Xie
Yunsong Li
Xuefei Ning
Zihan Meng
Shulin Zeng
Jie Lei
Zhenman Fang
Yu Wang
MQ
68
1
0
17 Nov 2024
Steam Turbine Anomaly Detection: An Unsupervised Learning Approach Using Enhanced Long Short-Term Memory Variational Autoencoder
Weiming Xu
Peng Zhang
51
0
0
16 Nov 2024
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
Shitong Shao
Zikai Zhou
Tian Ye
Lichen Bai
Zhiqiang Xu
Zeke Xie
DiffM
123
0
0
16 Nov 2024
CULL-MT: Compression Using Language and Layer pruning for Machine Translation
Pedram Rostami
M. Dousti
100
1
0
10 Nov 2024
Building an Efficient Multilingual Non-Profit IR System for the Islamic Domain Leveraging Multiprocessing Design in Rust
Vera Pavlova
Mohammed Makhlouf
73
1
0
09 Nov 2024
Optimizing Large Language Models through Quantization: A Comparative Analysis of PTQ and QAT Techniques
Jahid Hasan
MQ
121
1
0
09 Nov 2024
Towards Multi-Modal Mastery: A 4.5B Parameter Truly Multi-Modal Small Language Model
Ben Koska
Mojmír Horváth
MoE
64
1
0
08 Nov 2024
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
123
4
0
08 Nov 2024
Poor Man's Training on MCUs: A Memory-Efficient Quantized Back-Propagation-Free Approach
Yequan Zhao
Hai Li
Ian Young
Zheng Zhang
MQ
97
3
0
07 Nov 2024
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
Cengiz Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin, MoMe
106
29
0
07 Nov 2024
Decoupling Dark Knowledge via Block-wise Logit Distillation for Feature-level Alignment
Chengting Yu
Fengzhao Zhang
Ruizhe Chen
Zuozhu Liu
Shurun Tan
Er-ping Li
Aili Wang
96
2
0
03 Nov 2024
BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration
M. Rakka
Rachid Karami
A. Eltawil
M. Fouda
Fadi J. Kurdahi
MQ
82
1
0
03 Nov 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
144
0
0
01 Nov 2024
Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model
Wenjia Xie
Hao Wang
Lefei Zhang
Rui Zhou
Defu Lian
Enhong Chen
DiffM
130
4
0
31 Oct 2024
Neural Model Checking
Mirco Giacobbe
Daniel Kroening
Abhinandan Pal
Michael Tautschnig
NAI
71
2
0
31 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
484
0
0
29 Oct 2024
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models
Hang Guo
Yawei Li
Tao Dai
Shu-Tao Xia
Luca Benini
MQ
133
2
0
29 Oct 2024
SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity
Kaidi Wang
Jieru Zhao
Shuo Yang
Wenchao Ding
Minyi Guo
58
0
0
28 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
137
7
0
28 Oct 2024
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks
Graziano A. Manduzio
Federico A. Galatolo
M. G. Cimino
Enzo Pasquale Scilingo
Lorenzo Cominelli
LRM
38
1
0
24 Oct 2024
Quantum Large Language Models via Tensor Network Disentanglers
Borja Aizpurua
S. Jahromi
Sukhbinder Singh
Roman Orus
78
5
0
22 Oct 2024
Remote Timing Attacks on Efficient Language Model Inference
Nicholas Carlini
Milad Nasr
75
3
0
22 Oct 2024