ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.04503
  4. Cited By
VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision
  Neural Network Inference

VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

8 February 2021
Steve Dai
Rangharajan Venkatesan
Haoxing Ren
B. Zimmer
W. Dally
Brucek Khailany
    MQ
ArXivPDFHTML

Papers citing "VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference"

14 / 14 papers shown
Title
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
Yuzong Chen
Ahmed F. AbouElhamayed
Xilai Dai
Yang Wang
Marta Andronic
George A. Constantinides
Mohamed S. Abdelfattah
MQ
108
1
0
18 Nov 2024
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Yunfan LU
Kurt Keutzer
Jianfei Chen
Song Han
MQ
75
9
0
25 Oct 2024
Exploring FPGA designs for MX and beyond
Exploring FPGA designs for MX and beyond
Ebby Samson
Naveen Mellempudi
Wayne Luk
George A. Constantinides
MQ
32
1
0
01 Jul 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
44
6
0
31 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and
  Efficient Formats for LLMs
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng Li
Mohamed S. Abdelfattah
Zhiru Zhang
31
8
0
06 May 2024
Instance-Aware Group Quantization for Vision Transformers
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon
Dohyung Kim
Junyong Cheon
Bumsub Ham
MQ
ViT
29
7
0
01 Apr 2024
INT-FP-QSim: Mixed Precision and Formats For Large Language Models and
  Vision Transformers
INT-FP-QSim: Mixed Precision and Formats For Large Language Models and Vision Transformers
Lakshmi Nair
Mikhail Bernadskiy
Arulselvan Madhavan
Craig Chan
Ayon Basumallik
D. Bunandar
MQ
40
2
0
07 Jul 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
35
79
0
03 Apr 2023
With Shared Microexponents, A Little Shifting Goes a Long Way
With Shared Microexponents, A Little Shifting Goes a Long Way
Bita Darvish Rouhani
Ritchie Zhao
V. Elango
Rasoul Shafipour
Mathew Hall
...
Eric S. Chung
Zhaoxia Deng
S. Naghshineh
Jongsoo Park
Maxim Naumov
MQ
43
36
0
16 Feb 2023
Hyperspherical Quantization: Toward Smaller and More Accurate Models
Hyperspherical Quantization: Toward Smaller and More Accurate Models
Dan Liu
X. Chen
Chen Ma
Xue Liu
MQ
30
3
0
24 Dec 2022
Block Format Error Bounds and Optimal Block Size Selection
Block Format Error Bounds and Optimal Block Size Selection
I. Soloveychik
I. Lyubomirsky
Xin Wang
S. Bhoja
MQ
29
4
0
11 Oct 2022
DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image
  Super-Resolution Networks
DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks
Chee Hong
Heewon Kim
Sungyong Baik
Junghun Oh
Kyoung Mu Lee
OOD
SupR
MQ
24
41
0
21 Dec 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
Google's Neural Machine Translation System: Bridging the Gap between
  Human and Machine Translation
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
1