2502.00922
Cited By
Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference
2 February 2025
Patrick Yubeaton
Tareq Mahmoud
Shehab Naga
Pooria Taheri
Tianhua Xia
Arun George
Yasmein Khalil
Sai Qian Zhang
Siddharth Joshi
Chinmay Hegde
Siddharth Garg
MQ
Papers citing "Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference"
1 / 1 papers shown
Title
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
Vipin Chaudhary
Helen Zhou
Anshumali Shrivastava
MQ
63
2
0
15 Apr 2025