Title |
---|
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression. Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh |
The case for 4-bit precision: k-bit Inference Scaling Laws. Tim Dettmers, Luke Zettlemoyer |