Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2502.14882
Cited By
CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs
15 February 2025
Zeliang Zhang
Yifan Zhu
Susan Liang
Zhiyuan Wang
Jiani Liu
Haiting Lin
Mingjie Zhao
Chenliang Xu
Kun Wan
Wentian Zhao
Wentian Zhao
VLM
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs"
4 / 4 papers shown
Title
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai
Yichi Zhang
Bofei Gao
Yuliang Liu
Yongqian Li
...
Wayne Xiong
Yue Dong
Baobao Chang
Junjie Hu
Wen Xiao
88
92
0
04 Jun 2024
Mixtral of Experts
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
...
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
61
1,049
0
08 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Xia Hu
65
106
0
02 Jan 2024
H
2
_2
2
O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
84
275
0
24 Jun 2023
1