arXiv:2505.05772
Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
9 May 2025
Zehao Fan, Garrett Gagnon, Zhenyu Liu, Liu Liu
Papers citing "Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM"
3 / 3 papers shown
PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs
Jinendra Malekar, Peyton S. Chandarana, Md Hasibul Amin, Mohammed E. Elbtity, Ramtin Zand
31 Mar 2025
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
Yintao He, Haiyu Mao, Christina Giannoula, Mohammad Sadrosadati, Juan Gómez Luna, Huawei Li, Xiaowei Li, Ying Wang, O. Mutlu
21 Feb 2025
LoL-PIM: Long-Context LLM Decoding with Scalable DRAM-PIM System
Hyucksung Kwon, Kyungmo Koo, Janghyeon Kim, W. Lee, Minjae Lee, ..., Yongkee Kwon, Ilkon Kim, Euicheol Lim, John Kim, Jungwook Choi
28 Dec 2024