
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang
Haotong Qin
Yangdong Liu
Yawei Li
Qinshuo Liu
Xianglong Liu
Luca Benini
Michele Magno
Shiming Zhang
Xiaojuan Qi
Papers citing "SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models"
Title |
---|
DB-LLM: Accurate Dual-Binarization for Efficient LLMs. Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin Zhou, ..., Xuebo Liu, Min Zhang, Jinyang Guo, Xianglong Liu, Dacheng Tao |
Llama 2: Open Foundation and Fine-Tuned Chat Models. Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom |
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression. Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh |