Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2409.04992
Cited By
InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference
8 September 2024
Xiurui Pan
Endian Li
Qiao Li
Shengwen Liang
Yizhou Shan
Ke Zhou
Yingwei Luo
Xiaolin Wang
Jie Zhang
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference"
9 / 9 papers shown
Title
Cognitive Memory in Large Language Models
Lianlei Shan
Shixian Luo
Zezhou Zhu
Yu Yuan
Yong Wu
LLMAG
KELM
508
3
0
03 Apr 2025
iServe: An Intent-based Serving System for LLMs
Dimitrios Liakopoulos
Tianrui Hu
Prasoon Sinha
N. Yadwadkar
VLM
505
0
0
08 Jan 2025
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
Yuanchun Li
Hao Wen
Weijun Wang
Xiangyu Li
Yizhen Yuan
...
Zhijun Li
Peng Li
Yang Liu
Yaqiong Zhang
Yunxin Liu
LLMAG
81
189
0
10 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Helen Zhou
108
117
0
02 Jan 2024
H
2
_2
2
O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
147
313
0
24 Jun 2023
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method
Yiming Wang
Zhuosheng Zhang
Rui Wang
112
87
0
22 May 2023
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.5K
14,748
0
15 Mar 2023
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
149
662
0
05 Apr 2019
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
231
2,686
0
09 May 2017
1