ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing

22 June 2023
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
    MQ

Papers citing "Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing"

19 papers shown
SLICK: Selective Localization and Instance Calibration for Knowledge-Enhanced Car Damage Segmentation in Automotive Insurance
Teerapong Panboonyuen
150
0
0
12 Jun 2025
Why Do Some Inputs Break Low-Bit LLM Quantization?
Ting-Yun Chang
Muru Zhang
Jesse Thomason
Robin Jia
MQ
22
0
0
24 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
95
0
0
13 May 2025
Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Haozheng Luo
Chenghao Qiu
Maojiang Su
Zhihan Zhou
Zoe Mehta
Guo Ye
Jerry Yao-Chieh Hu
Han Liu
AAML
107
1
0
01 May 2025
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax
Zayd Muhammad Kawakibi Zuhri
Erland Hilman Fuadi
Alham Fikri Aji
54
0
0
29 Apr 2025
Outlier dimensions favor frequent tokens in language models
Iuri Macocco
Nora Graichen
Gemma Boleda
Marco Baroni
132
1
0
27 Mar 2025
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning
Baohao Liao
Christian Herold
Seyyed Hadi Hashemi
Stefan Vasilev
Shahram Khadivi
Christof Monz
MQ
130
0
0
17 Mar 2025
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang
Jinyeong Kim
Junhyeok Kim
Seong Jae Hwang
VLM
166
10
0
05 Mar 2025
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
261
126
0
21 Feb 2025
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
RALM, LRM
114
0
0
25 Jan 2025
DecDEC: A Systems Approach to Advancing Low-Bit LLM Quantization
Y. Park
Jake Hyun
Hojoon Kim
Jae W. Lee
MQ
122
0
0
28 Dec 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
92
7
0
14 Oct 2024
Differential Transformer
Tianzhu Ye
Li Dong
Yuqing Xia
Yutao Sun
Yi Zhu
Gao Huang
Furu Wei
501
0
0
07 Oct 2024
Beat this! Accurate beat tracking without DBN postprocessing
Francesco Foscarin
Jan Schlüter
Gerhard Widmer
74
7
0
31 Jul 2024
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Bjorn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
123
10
0
24 Jul 2024
QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
Jiun-Man Chen
Yu-Hsuan Chao
Yu-Jie Wang
Ming-Der Shieh
Chih-Chung Hsu
Wei-Fen Lin
MQ
84
1
0
11 Mar 2024
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
Ruikang Liu
Haoli Bai
Haokun Lin
Yuening Li
Han Gao
Zheng-Jun Xu
Lu Hou
Jun Yao
Chun Yuan
MQ
84
32
0
02 Mar 2024
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
Jung Hwan Heo
Jeonghoon Kim
Beomseok Kwon
Byeongwook Kim
Se Jung Kwon
Dongsoo Lee
MQ
129
10
0
27 Sep 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging
J. Heo
Seyedarmin Azizi
A. Fayyazi
Massoud Pedram
123
3
0
04 Mar 2023