arXiv:2407.12866
Beyond KV Caching: Shared Attention for Efficient LLMs
13 July 2024
Bingli Liao
Danilo Vasconcellos Vargas
Papers citing "Beyond KV Caching: Shared Attention for Efficient LLMs"
3 / 3 papers shown
Cognitive Memory in Large Language Models
Lianlei Shan
Shixian Luo
Zezhou Zhu
Yu Yuan
Yong Wu
LLMAG
KELM
235
1
0
03 Apr 2025
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
You Wu
Haoyi Wu
Kewei Tu
34
3
0
18 Oct 2024
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
6,996
0
20 Apr 2018