SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

14 February 2022
Cong Guo
Yuxian Qiu
Jingwen Leng
Xiaotian Gao
Chen Zhang
Yunxin Liu
Fan Yang
Yuhao Zhu
Minyi Guo
    MQ
ArXiv (abs) | PDF | HTML | GitHub (131★)

Papers citing "SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation"

Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Mingliang Xu
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Chia-Wen Lin
Zhanpeng Zeng
Rongrong Ji
MQ
316
0
0
31 Dec 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
468
0
0
29 Oct 2024
Block-Skim: Efficient Question Answering for Transformer
Yue Guan
Zhengyi Li
Jingwen Leng
Zhouhan Lin
Minyi Guo
Yuhao Zhu
71
32
0
16 Dec 2021
Dual-side Sparse Tensor Core
Yang-Feng Wang
Chen Zhang
Zhiqiang Xie
Cong Guo
Yunxin Liu
Jingwen Leng
83
75
0
20 May 2021
Zero-shot Adversarial Quantization
Yuang Liu
Wei Zhang
Jun Wang
MQ
107
79
0
29 Mar 2021
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
Yuhang Li
Ruihao Gong
Xu Tan
Yang Yang
Peng Hu
Qi Zhang
F. Yu
Wei Wang
Shi Gu
MQ
153
444
0
10 Feb 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
88
2,234
0
11 Jan 2021
How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention
Yue Guan
Jingwen Leng
Chao Li
Quan Chen
Minyi Guo
56
19
0
02 Nov 2020
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
Yikai Wu
Xingyu Zhu
Chenwei Wu
Annie Wang
Rong Ge
110
45
0
08 Oct 2020
Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity
Cong Guo
B. Hsueh
Jingwen Leng
Yuxian Qiu
Yue Guan
Zehuan Wang
Xiaoying Jia
Xipeng Li
Minyi Guo
Yuhao Zhu
71
83
0
29 Aug 2020
Channel-wise Hessian Aware trace-Weighted Quantization of Neural Networks
Xu Qian
Victor Li
Darren Crews
MQ
44
9
0
19 Aug 2020
Sparse GPU Kernels for Deep Learning
Trevor Gale
Matei A. Zaharia
C. Young
Erich Elsen
80
234
0
18 Jun 2020
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
Itay Hubara
Yury Nahshan
Y. Hanani
Ron Banner
Daniel Soudry
MQ
109
128
0
14 Jun 2020
Data-Free Network Quantization With Adversarial Knowledge Distillation
Yoojin Choi
Jihwan P. Choi
Mostafa El-Khamy
Jungwon Lee
MQ
74
121
0
08 May 2020
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
95
588
0
22 Apr 2020
Generative Low-bitwidth Data Free Quantization
Shoukai Xu
Haokun Li
Bohan Zhuang
Jing Liu
Jingyun Liang
Chuangrun Liang
Mingkui Tan
MQ
60
127
0
07 Mar 2020
SpArch: Efficient Architecture for Sparse Matrix Multiplication
Zhekai Zhang
Hanrui Wang
Song Han
W. Dally
71
233
0
20 Feb 2020
Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration
Cong Guo
Yangjie Zhou
Jingwen Leng
Yuhao Zhu
Zidong Du
Quan Chen
Chao Li
Bin Yao
Minyi Guo
52
33
0
18 Feb 2020
ZeroQ: A Novel Zero Shot Quantization Framework
Yaohui Cai
Z. Yao
Zhen Dong
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
101
399
0
01 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
565
42,639
0
03 Dec 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
95
282
0
10 Nov 2019
Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations
Bohan Zhuang
Jing Liu
Mingkui Tan
Lingqiao Liu
Ian Reid
Chunhua Shen
MQ
72
46
0
10 Aug 2019
Data-Free Quantization Through Weight Equalization and Bias Correction
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
Max Welling
MQ
75
515
0
11 Jun 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
Zhen Dong
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
92
528
0
29 Apr 2019
Adversarial Defense Through Network Profiling Based Path Extraction
Yuxian Qiu
Jingwen Leng
Cong Guo
Quan Chen
Chong Li
Minyi Guo
Yuhao Zhu
AAML
66
51
0
17 Apr 2019
Low-bit Quantization of Neural Networks for Efficient Inference
Yoni Choukroun
Eli Kravchik
Fan Yang
P. Kisilev
MQ
82
364
0
18 Feb 2019
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Ritchie Zhao
Yuwei Hu
Jordan Dotzel
Christopher De Sa
Zhiru Zhang
OODD, MQ
119
311
0
28 Jan 2019
SqueezeNext: Hardware-Aware Neural Network Design
A. Gholami
K. Kwon
Bichen Wu
Zizheng Tai
Xiangyu Yue
Peter H. Jin
Sicheng Zhao
Kurt Keutzer
62
299
0
23 Mar 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
167
3,148
0
15 Dec 2017
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
Xiangyu Zhang
Xinyu Zhou
Mengxiao Lin
Jian Sun
AI4TS
152
6,896
0
04 Jul 2017
Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
Xin Dong
Shangyu Chen
Sinno Jialin Pan
183
507
0
22 May 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,510
0
10 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV, BDL
886
27,427
0
02 Dec 2015
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
263
8,864
0
01 Oct 2015
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
316
6,709
0
08 Jun 2015
Deep Learning with Limited Numerical Precision
Suyog Gupta
A. Agrawal
K. Gopalakrishnan
P. Narayanan
HAI
209
2,049
0
09 Feb 2015