arXiv: 2301.13330
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
30 January 2023
Deepika Bablani, J. McKinstry, S. K. Esser, R. Appuswamy, D. Modha
MQ
Papers citing "Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference"
3 / 3 papers shown
Title
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs
Yan Yang, Yixia Li, Hongru Wang, Xuetao Wei, Jianqiao Yu, Yun-Nung Chen, Guanhua Chen
MoMe
26
0
0
17 Apr 2025
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
Bowen Ping, Shuo Wang, Hanqing Wang, Xu Han, Yuzhuang Xu, Yukun Yan, Yun Chen, Baobao Chang, Zhiyuan Liu, Maosong Sun
MQ
43
4
0
13 Jun 2024
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
Weihan Chen, Peisong Wang, Jian Cheng
MQ
33
61
0
13 Oct 2021