
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs
Papers citing "DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs" (18 papers)
---|
- Llama 2: Open Foundation and Fine-Tuned Chat Models. Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
- SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression. Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh
- The case for 4-bit precision: k-bit Inference Scaling Laws. Tim Dettmers, Luke Zettlemoyer