TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization
arXiv: 2505.19586 (v2, latest) · 26 May 2025
Dingyu Yao, Bowen Shen, Zheng Lin, Wei Liu, Jian Luan, Bin Wang, Weiping Wang
Papers citing "TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization" (3 of 3 papers shown)
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Yuan Feng, Junlin Lv, Yukun Cao, Xike Xie, S. K. Zhou
28 Jan 2025
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, ..., Kai Zhang, Chong Chen, Fan Yang, Yue Yang, Lili Qiu
03 Jan 2025
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Yongqian Li, ..., Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao
04 Jun 2024