
Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
Papers citing "Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization"
- Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models. Jung Hwan Heo, Jeonghoon Kim, Beomseok Kwon, Byeongwook Kim, Se Jung Kwon, Dongsoo Lee
- Llama 2: Open Foundation and Fine-Tuned Chat Models. Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom