2506.03510
Cited By
Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information
4 June 2025
Seungcheol Park, Sojin Lee, Jongjin Kim, Jinsik Lee, Hyunjik Jo, U. Kang
Papers citing "Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information"
2 / 2 papers shown
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
Seungcheol Park, Jeongin Bae, Beomseok Kwon, Minjun Kim, Byeongwook Kim, S. Kwon, U. Kang, Dongsoo Lee
MQ
122
0
0
04 Jun 2025
Zero-shot Quantization: A Comprehensive Survey
Minjun Kim, Jaehyeon Choi, Jongkeun Lee, Wonjin Cho, U. Kang
MQ
90
2
0
14 May 2025