arXiv: 2404.00456
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs


30 March 2024
Saleh Ashkboos
Amirkeivan Mohtashami
Maximilian L. Croci
Bo Li
Martin Jaggi
Dan Alistarh
Torsten Hoefler
James Hensman
    MQ
ArXiv (abs) | PDF | HTML | GitHub (390★)

Papers citing "QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs"

9 / 59 papers shown
FlatQuant: Flatness Matters for LLM Quantization
Yuxuan Sun
Ruikang Liu
Haoli Bai
Han Bao
Kang Zhao
...
Lu Hou
Chun Yuan
Xin Jiang
Wen Liu
Jun Yao
MQ
176
11
0
12 Oct 2024
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation
Mike Ranzinger
Jon Barker
Greg Heinrich
Pavlo Molchanov
Bryan Catanzaro
Andrew Tao
102
5
0
02 Oct 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Ping Luo
MQ
158
35
0
10 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
Li Du
Guoqi Li
Jiajun Zhang
170
6
0
05 Jul 2024
QTIP: Quantization with Trellises and Incoherence Processing
Albert Tseng
Qingyao Sun
David Hou
Christopher De Sa
MQ
121
18
0
17 Jun 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Yu Wang
MQVGen
195
35
0
04 Jun 2024
BiSup: Bidirectional Quantization Error Suppression for Large Language Models
Minghui Zou
Ronghui Guo
Sai Zhang
Xiaowang Zhang
Zhiyong Feng
MQ
82
1
0
24 May 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Yujun Lin
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
172
98
0
07 May 2024
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
Ruikang Liu
Haoli Bai
Haokun Lin
Yuening Li
Han Gao
Zheng-Jun Xu
Lu Hou
Jun Yao
Chun Yuan
MQ
84
32
0
02 Mar 2024