LongHeads: Multi-Head Attention is Secretly a Long Context Processor
16 February 2024
Yi Lu, Xin Zhou, Wei He, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
arXiv: 2402.10685
Papers citing "LongHeads: Multi-Head Attention is Secretly a Long Context Processor" (4 / 4 papers shown)

FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension
Jushi Kai, Boyi Zeng, Yixuan Wang, Haoli Bai, Bo Jiang, Zhouhan Lin
01 May 2025

Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Yi Lu, Wanxu Zhao, Xin Zhou, Chenxin An, Cong Wang, ..., Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
26 Apr 2025

Cognitive Memory in Large Language Models
Lianlei Shan, Shixian Luo, Zezhou Zhu, Yu Yuan, Yong Wu
03 Apr 2025

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021