
CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs
arXiv:2502.14882, 15 February 2025
Zeliang Zhang, Yifan Zhu, Susan Liang, Zhiyuan Wang, Jiani Liu, Haiting Lin, Mingjie Zhao, Chenliang Xu, Kun Wan, Wentian Zhao
Topics: VLM, MQ
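The paper's topic, 1-bit KV cache quantization, compresses each cached key/value entry down to a single bit plus a small amount of per-channel metadata. The sketch below is not CalibQuant's actual algorithm (which includes calibration steps described in the paper); it is a generic sign-based 1-bit scheme, with a hypothetical `quantize_1bit`/`dequantize_1bit` API, shown only to illustrate the storage idea.

```python
import numpy as np

def quantize_1bit(kv: np.ndarray, axis: int = -1):
    """Illustrative 1-bit quantization of a KV cache tensor.

    Each element is reduced to its sign relative to the channel mean;
    a per-channel mean and scale are kept in full precision so the
    tensor can be approximately reconstructed.
    """
    mean = kv.mean(axis=axis, keepdims=True)
    scale = np.abs(kv - mean).mean(axis=axis, keepdims=True)
    bits = kv >= mean  # one bit per element
    return bits, mean, scale

def dequantize_1bit(bits: np.ndarray, mean: np.ndarray, scale: np.ndarray):
    """Reconstruct an approximation: mean plus or minus the scale."""
    return mean + np.where(bits, scale, -scale)

# Toy KV cache: (heads, tokens, head_dim)
kv = np.random.randn(2, 4, 8).astype(np.float32)
bits, mean, scale = quantize_1bit(kv)
kv_hat = dequantize_1bit(bits, mean, scale)
```

Real systems would additionally pack the boolean mask into bytes (8 entries per byte) and calibrate the scales on representative data, which is where methods like CalibQuant differ from this naive sketch.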

Papers citing "CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs"

4 / 4 papers shown

PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Yongqian Li, ..., Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao
04 Jun 2024

Mixtral of Experts
Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, A. Mensch, Blanche Savary, ..., Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
Topics: MoE, LLMAG
08 Jan 2024

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu
02 Jan 2024

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang, Ying Sheng, Dinesh Manocha, Tianlong Chen, Lianmin Zheng, ..., Yuandong Tian, Christopher Ré, Clark W. Barrett, Zhangyang Wang, Beidi Chen
Topics: VLM
24 Jun 2023