Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
Zehao Fan, Garrett Gagnon, Zhenyu Liu, Liu Liu
arXiv:2505.05772 · 9 May 2025

Papers citing "Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM"

3 papers:
PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs
Jinendra Malekar, Peyton S. Chandarana, Md Hasibul Amin, Mohammed E. Elbtity, Ramtin Zand
31 Mar 2025
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
Yintao He, Haiyu Mao, Christina Giannoula, Mohammad Sadrosadati, Juan Gómez Luna, Huawei Li, Xiaowei Li, Ying Wang, O. Mutlu
21 Feb 2025
LoL-PIM: Long-Context LLM Decoding with Scalable DRAM-PIM System
Hyucksung Kwon, Kyungmo Koo, Janghyeon Kim, W. Lee, Minjae Lee, ..., Yongkee Kwon, Ilkon Kim, Euicheol Lim, John Kim, Jungwook Choi
28 Dec 2024