LongHeads: Multi-Head Attention is Secretly a Long Context Processor

16 February 2024 · arXiv: 2402.10685
Yi Lu, Xin Zhou, Wei He, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
Tags: LLMAG
Links: arXiv · PDF · HTML

Papers citing "LongHeads: Multi-Head Attention is Secretly a Long Context Processor"

4 / 4 papers shown

FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension
Jushi Kai, Boyi Zeng, Yansen Wang, Haoli Bai, Bo Jiang, Zhouhan Lin
01 May 2025

Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Yi Lu, Wanxu Zhao, Xin Zhou, Chenxin An, Cong Wang, ..., Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
26 Apr 2025

Cognitive Memory in Large Language Models
Lianlei Shan, Shixian Luo, Zezhou Zhu, Yu Yuan, Yong Wu
Tags: LLMAG, KELM
03 Apr 2025

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021