ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
10 February 2021
Yuhang Li, Ruihao Gong, Xu Tan, Yang Yang, Peng Hu, Qi Zhang, F. Yu, Wei Wang, Shi Gu
MQ

Papers citing "BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction"

50 / 113 papers shown
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
Jung Hwan Heo, Jeonghoon Kim, Beomseok Kwon, Byeongwook Kim, Se Jung Kwon, Dongsoo Lee
MQ · 27 Sep 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin, Dibakar Gope, Diana Marculescu
MQ · 21 Aug 2023
NUPES: Non-Uniform Post-Training Quantization via Power Exponent Search
Edouard Yvinec, Arnaud Dapogny, Kévin Bailly
MQ · 10 Aug 2023
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Chris De Sa
MQ · 25 Jul 2023
Digital Modeling on Large Kernel Metamaterial Neural Network
Quan Liu, Hanyu Zheng, Brandon T. Swartz, Ho Hin Lee, Zuhayr Asad, I. Kravchenko, Jason G Valentine, Yuankai Huo
21 Jul 2023
InfLoR-SNN: Reducing Information Loss for Spiking Neural Networks
Yu-Zhu Guo, Y. Chen, Liwen Zhang, Xiaode Liu, Xinyi Tong, Yuanyuan Ou, Xuhui Huang, Zhe Ma
AAML · 10 Jul 2023
Squeezing Large-Scale Diffusion Models for Mobile
Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Do-Hyun Jo, H. Jeon, Jae-Joon Kim, Hyungjun Kim
03 Jul 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
MQ · 22 Jun 2023
PTQD: Accurate Post-Training Quantization for Diffusion Models
Yefei He, Luping Liu, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang
DiffM · MQ · 18 May 2023
Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
MQ · 11 May 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan, Lin Niu, Jia-Wen Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu
MQ · 03 Apr 2023
A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation
Bo-Kyeong Kim, Jaemin Kang, Daeun Seo, Hancheol Park, Shinkook Choi, Hyoung-Kyu Song, Hyungshin Kim, Sungsu Lim
02 Apr 2023
FP8 versus INT8 for efficient deep learning inference
M. V. Baalen, Andrey Kuzmin, Suparna S. Nair, Yuwei Ren, E. Mahurin, ..., Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph B. Soriaga, Tijmen Blankevoort
MQ · 31 Mar 2023
Hard Sample Matters a Lot in Zero-Shot Quantization
Huantong Li, Xiangmiao Wu, Fanbing Lv, Daihai Liao, Thomas H. Li, Yonggang Zhang, Bo Han, Mingkui Tan
MQ · 24 Mar 2023
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma, Huixia Li, Xiawu Zheng, Xuefeng Xiao, Rui Wang, Shilei Wen, Xin Pan, Rongrong Ji
MQ · 21 Mar 2023
A High-Performance Accelerator for Super-Resolution Processing on Embedded GPU
W. Zhao, Qi Sun, Yang Bai, Wenbo Li, Haisheng Zheng, Bei Yu, Martin D. F. Wong
SupR · 16 Mar 2023
Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas, Yury Nahshan, Hanoch Kremer, Gil Lederman, Shira Zaloshinski, Zheng Li, E. Haleva
MQ · 03 Mar 2023
DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
Jiajun Zhou, Jiajun Wu, Yizhao Gao, Yuhao Ding, Chaofan Tao, Bo-wen Li, Fengbin Tu, Kwang-Ting Cheng, Hayden Kwok-Hay So, Ngai Wong
MQ · 24 Feb 2023
Q-Diffusion: Quantizing Diffusion Models
Xiuyu Li, Yijia Liu, Long Lian, Hua Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
DiffM · MQ · 08 Feb 2023
ACQ: Improving Generative Data-free Quantization Via Attention Correction
Jixing Li, Xiaozhou Guo, Benzhe Dai, Guoliang Gong, Min Jin, Gang Chen, Wenyu Mao, Huaxiang Lu
MQ · 18 Jan 2023
Hyperspherical Quantization: Toward Smaller and More Accurate Models
Dan Liu, X. Chen, Chen Ma, Xue Liu
MQ · 24 Dec 2022
CSMPQ: Class Separability Based Mixed-Precision Quantization
Ming-Yu Wang, Taisong Jin, Miaohui Zhang, Zhengtao Yu
MQ · 20 Dec 2022
Redistribution of Weights and Activations for AdderNet Quantization
Ying Nie, Kai Han, Haikang Diao, Chuanjian Liu, Enhua Wu, Yunhe Wang
MQ · 20 Dec 2022
Masked Wavelet Representation for Compact Neural Radiance Fields
Daniel Rho, Byeonghyeon Lee, Seungtae Nam, J. Lee, J. Ko, Eunbyung Park
18 Dec 2022
RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu
MQ · 16 Dec 2022
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu, Lin Niu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu
MQ · 14 Dec 2022
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein, Ella Fuchs, Idan Tal, Mark Grobman, Niv Vosco, Eldad Meller
MQ · 05 Dec 2022
Post-training Quantization on Diffusion Models
Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan
DiffM · MQ · 28 Nov 2022
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers
N. Frumkin, Dibakar Gope, Diana Marculescu
ViT · MQ · 17 Nov 2022
Long-Range Zero-Shot Generative Deep Network Quantization
Yan Luo, Yangcheng Gao, Zhao Zhang, Haijun Zhang, Mingliang Xu, Meng Wang
MQ · 13 Nov 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
MQ · 31 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon, Jeonghoon Kim, Jeongin Bae, Kang Min Yoo, Jin-Hwa Kim, Baeseong Park, Byeongwook Kim, Jung-Woo Ha, Nako Sung, Dongsoo Lee
MQ · 08 Oct 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, F. Yu, Xianglong Liu
MQ · 27 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li, Mengjuan Chen, Junrui Xiao, Qingyi Gu
ViT · MQ · 13 Sep 2022
A simple approach for quantizing neural networks
J. Maly, Rayan Saab
MQ · 07 Sep 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li, Cong Guo, Zhanda Zhu, Yangjie Zhou, Yuxian Qiu, Xiaotian Gao, Jingwen Leng, Minyi Guo
MQ · 25 Aug 2022
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar, Sidak Pal Singh, Dan Alistarh
MQ · 24 Aug 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin, M. V. Baalen, Yuwei Ren, Markus Nagel, Jorn W. T. Peters, Tijmen Blankevoort
MQ · 19 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park, Yeongsang Jang, Eunhyeok Park
MQ · 31 Jul 2022
BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation
Geondo Park, Jaehong Yoon, H. Zhang, Xingge Zhang, Sung Ju Hwang, Yonina C. Eldar
MQ · 04 Jul 2022
RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization
Hongyi Yao, Pu Li, Jian Cao, Xiangcheng Liu, Chenying Xie, Bin Wang
MQ · 26 Apr 2022
Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment
Qiang Hu, Yuejun Guo, Maxime Cordy, Xiaofei Xie, Wei Ma, Mike Papadakis, Yves Le Traon
MQ · 08 Apr 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei, Ruihao Gong, Yuhang Li, Xianglong Liu, F. Yu
MQ · VLM · 11 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu
MQ · ViT · 04 Mar 2022
Post-Training Quantization for Cross-Platform Learned Image Compression
Dailan He, Zi Yang, Yuan-Hsin Chen, Qi Zhang, Hongwei Qin, Yan Wang
MQ · 15 Feb 2022
SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo
MQ · 14 Feb 2022
Energy awareness in low precision neural networks
Nurit Spingarn-Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaacov, T. Michaeli
06 Feb 2022
COIN++: Neural Compression Across Modalities
Emilien Dupont, H. Loya, Milad Alizadeh, Adam Goliński, Yee Whye Teh, Arnaud Doucet
30 Jan 2022
Post-training Quantization for Neural Networks with Provable Guarantees
Jinjie Zhang, Yixuan Zhou, Rayan Saab
MQ · 26 Jan 2022
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu, Jianfei Cai, Bohan Zhuang
MQ · 24 Nov 2021