Low-bit Quantization of Neural Networks for Efficient Inference

18 February 2019
Yoni Choukroun
Eli Kravchik
Fan Yang
P. Kisilev
    MQ

Papers citing "Low-bit Quantization of Neural Networks for Efficient Inference"

50 / 182 papers shown
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu
Xiuhong Li
Zhihang Yuan
Size Zheng
Jiangfei Duan
Xingcheng Zhang
Dahua Lin
MQ
MoE
174
0
0
09 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ
VLM
68
0
0
08 May 2025
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
Xuan Shen
Weize Ma
Jing Liu
Changdi Yang
Rui Ding
...
Wei Niu
Yanzhi Wang
Pu Zhao
Jun Lin
Jiuxiang Gu
MQ
57
0
0
20 Mar 2025
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
Jinguang Wang
Yufei Guo
Haifeng Sun
Tingting Yang
Zirui Zhuang
Wanyi Ning
Yuexi Yin
Q. Qi
Jianxin Liao
MQ
MoMe
51
0
0
07 Mar 2025
QArtSR: Quantization via Reverse-Module and Timestep-Retraining in One-Step Diffusion based Image Super-Resolution
Libo Zhu
Haotong Qin
Kaicheng Yang
W. J. Li
Yong Guo
Yulun Zhang
Susanto Rahardja
Xiaokang Yang
MQ
DiffM
66
0
0
07 Mar 2025
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang
Bingqi Shang
Y. Li
Payal Mohapatra
Wei Dong
Xiao-Xu Wang
Qi Zhu
ViT
43
0
0
01 Mar 2025
CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution
Kai Liu
Dehui Wang
Zhiteng Li
Zheng Chen
Yong Guo
W. J. Li
L. Kong
Yulun Zhang
MQ
73
1
0
24 Feb 2025
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
Jiajun Zhou
Yifan Yang
Kai Zhen
Z. Liu
Yequan Zhao
Ershad Banijamali
Athanasios Mouchtaris
Ngai Wong
Zheng Zhang
MQ
41
0
0
17 Feb 2025
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model
Branislava Jankovic
Sabina Jangirova
Waseem Ullah
Latif U. Khan
Mohsen Guizani
31
0
0
21 Jan 2025
Improving Quantization-aware Training of Low-Precision Network via Block Replacement on Full-Precision Counterpart
Chengting Yu
Shu Yang
Fengzhao Zhang
Hanzhi Ma
Aili Wang
Er-ping Li
MQ
81
2
0
20 Dec 2024
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution
Libo Zhu
Jiajian Li
Haotong Qin
W. J. Li
Yulun Zhang
Yong Guo
Xiaokang Yang
DiffM
MQ
72
2
0
26 Nov 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
43
0
0
01 Nov 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
163
0
0
29 Oct 2024
Content-Aware Radiance Fields: Aligning Model Complexity with Scene Intricacy Through Learned Bitwidth Quantization
Wei Liu
Xue Xian Zheng
Jingyi Yu
Xin Lou
MQ
34
0
0
25 Oct 2024
Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks
Alireza Khodamoradi
K. Denolf
Eric Dellinger
MQ
39
0
0
15 Oct 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
32
1
0
10 Oct 2024
Accelerating Error Correction Code Transformers
Matan Levy
Yoni Choukroun
Lior Wolf
MQ
23
0
0
08 Oct 2024
P4Q: Learning to Prompt for Quantization in Visual-language Models
H. Sun
Runqi Wang
Yanjing Li
Xianbin Cao
Xiaolong Jiang
Yao Hu
Baochang Zhang
MQ
VLM
44
0
0
26 Sep 2024
PTQ4RIS: Post-Training Quantization for Referring Image Segmentation
Xiaoyan Jiang
Hang Yang
Kaiying Zhu
Xihe Qiu
Shibo Zhao
Sifan Zhou
MQ
26
0
0
25 Sep 2024
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview
Yanshu Wang
Tong Yang
Xiyan Liang
Guoan Wang
Hanning Lu
Xu Zhe
Yaoming Li
Li Weitao
MQ
42
3
0
18 Sep 2024
Dynamic Range Reduction via Branch-and-Bound
Thore Gerlach
Nico Piatkowski
18
0
0
17 Sep 2024
Infrared Domain Adaptation with Zero-Shot Quantization
Burak Sevsay
Erdem Akagündüz
VLM
MQ
30
1
0
25 Aug 2024
Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models
Cheng Chen
Christina Giannoula
Andreas Moshovos
DiffM
MQ
24
0
0
13 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
Róisín Luo
Alexandru Drimbarean
Walsh Simon
Colm O'Riordan
MQ
37
0
0
01 Aug 2024
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
Kanghyun Choi
Hyeyoon Lee
Dain Kwon
Sunjong Park
Kyuyeun Kim
Noseong Park
Jinho Lee
MQ
48
1
0
29 Jul 2024
Temporal Feature Matters: A Framework for Diffusion Model Quantization
Yushi Huang
Ruihao Gong
Xianglong Liu
Jing Liu
Yuhang Li
Jiwen Lu
Dacheng Tao
DiffM
MQ
49
0
0
28 Jul 2024
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
Matt Gorbett
Hossein Shirazi
Indrakshi Ray
MQ
43
0
0
16 Jul 2024
QVD: Post-training Quantization for Video Diffusion Models
Shilong Tian
Hong Chen
Chengtao Lv
Yu Liu
Jinyang Guo
Xianglong Liu
Shengxi Li
Hao Yang
Tao Xie
VGen
MQ
46
3
0
16 Jul 2024
Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and Approaches
Islam Eldifrawi
Shengrui Wang
Amine Trabelsi
46
8
0
09 Jul 2024
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Jinguang Wang
Yuexi Yin
Haifeng Sun
Qi Qi
Jingyu Wang
Zirui Zhuang
Tingting Yang
Jianxin Liao
43
2
0
27 Jun 2024
Low-Rank Quantization-Aware Training for LLMs
Yelysei Bondarenko
Riccardo Del Chiaro
Markus Nagel
MQ
33
10
0
10 Jun 2024
2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution
Kai Liu
Haotong Qin
Yong Guo
Xin Yuan
Linghe Kong
Guihai Chen
Yulun Zhang
MQ
35
5
0
10 Jun 2024
P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
Huihong Shi
Xin Cheng
Wendong Mao
Zhongfeng Wang
MQ
48
3
0
30 May 2024
Extreme Compression of Adaptive Neural Images
Leo Hoshikawa
Marcos V. Conde
Takeshi Ohashi
Atsushi Irie
48
1
0
27 May 2024
Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks
Boheng Li
Yishuo Cai
Haowei Li
Feng Xue
Zhifeng Li
Yiming Li
MQ
AAML
35
20
0
21 May 2024
Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection
Yunqian Fan
Xiuying Wei
Ruihao Gong
Yuqing Ma
Xiangguo Zhang
Qi Zhang
Xianglong Liu
MQ
40
2
0
10 May 2024
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
Huihong Shi
Haikuo Shao
Wendong Mao
Zhongfeng Wang
ViT
MQ
44
3
0
06 May 2024
PTQ4SAM: Post-Training Quantization for Segment Anything
Chengtao Lv
Hong Chen
Jinyang Guo
Yifu Ding
Xianglong Liu
VLM
MQ
31
13
0
06 May 2024
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
Dayou Du
Gu Gong
Xiaowen Chu
MQ
38
7
0
01 May 2024
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution
Chee Hong
Kyoung Mu Lee
SupR
MQ
26
2
0
04 Apr 2024
Minimize Quantization Output Error with Bias Compensation
Cheng Gong
Haoshuai Zheng
Mengting Hu
Zheng Lin
Deng-Ping Fan
Yuzhi Zhang
Tao Li
MQ
38
2
0
02 Apr 2024
PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks
Marina Neseem
Conor McCullough
Randy Hsin
Chas Leichner
Shan Li
...
Andrew G. Howard
Lukasz Lew
Sherief Reda
Ville Rautio
Daniele Moro
MQ
47
0
0
29 Mar 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
40
3
0
25 Mar 2024
QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
Jiun-Man Chen
Yu-Hsuan Chao
Yu-Jie Wang
Ming-Der Shieh
Chih-Chung Hsu
Wei-Fen Lin
MQ
37
1
0
11 Mar 2024
Training Machine Learning models at the Edge: A Survey
Aymen Rayane Khouas
Mohamed Reda Bouadjenek
Hakim Hacid
Sunil Aryal
29
10
0
05 Mar 2024
Towards Accurate Post-training Quantization for Reparameterized Models
Luoming Zhang
Yefei He
Wen Fei
Zhenyu Lou
Weijia Wu
YangWei Ying
Hong Zhou
MQ
43
0
0
25 Feb 2024
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
Arnav Chavan
Raghav Magazine
Shubham Kushwaha
M. Debbah
Deepak Gupta
16
18
0
02 Feb 2024
Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation
Ruiping Liu
Jiaming Zhang
Kunyu Peng
Yufan Chen
Ke Cao
Junwei Zheng
M. Sarfraz
Kailun Yang
Rainer Stiefelhagen
VLM
42
8
0
30 Jan 2024
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
Sifan Zhou
Liang Li
Xinyu Zhang
Bo-Wen Zhang
Shipeng Bai
Miao Sun
Ziyu Zhao
Xiaobo Lu
Xiangxiang Chu
MQ
41
12
0
29 Jan 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
61
5
0
22 Jan 2024