2410.05265
Cited By
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization
28 January 2025
Mengzhao Chen
Yi Liu
Jiahao Wang
Yi Bin
Wenqi Shao
Ping Luo
MQ
Papers citing
"PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization"
4 / 4 papers shown
Title
MiniCPM4: Ultra-Efficient LLMs on End Devices
MiniCPM Team
Chaojun Xiao
Yuxuan Li
Xu Han
Yuzhuo Bai
...
Zhiyuan Liu
Guoyang Zeng
Chao Jia
Dahai Li
Maosong Sun
MLLM
41
0
0
09 Jun 2025
Gradual Binary Search and Dimension Expansion : A general method for activation quantization in LLMs
Lucas Maisonnave
Cyril Moineau
Olivier Bichler
Fabrice Rastello
MQ
126
0
0
18 Apr 2025
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
270
126
0
21 Feb 2025
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Chengyue Wu
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
175
98
0
07 May 2024