Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models

13 June 2024

Shuo Wang

Hanqing Wang

Xu Han

Yun Chen

Zhiyuan Liu

Maosong Sun

Papers citing "Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models"

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

Xuan Shen, Peiyan Dong, Lei Lu, Zhenglun Kong, Zhengang Li, Ming Lin, Chao Wu, Yanzhi Wang

09 Dec 2023