Compressive Transformers for Long-Range Sequence Modelling
arXiv:1911.05507
13 November 2019
Jack W. Rae
Anna Potapenko
Siddhant M. Jayakumar
Timothy Lillicrap
RALM · VLM · KELM
ArXiv (abs) · PDF · HTML

Papers citing "Compressive Transformers for Long-Range Sequence Modelling"

50 / 232 papers shown
LazyEviction: Lagged KV Eviction with Attention Pattern Observation for Efficient Long Reasoning
Haoyue Zhang
Hualei Zhang
Xiaosong Ma
Jie Zhang
Song Guo
LRM
20
0
0
19 Jun 2025
Long-Short Alignment for Effective Long-Context Modeling in LLMs
Tianqi Du
Haotian Huang
Yifei Wang
Yisen Wang
21
0
0
13 Jun 2025
MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
J. Oswald
Nino Scherrer
Seijin Kobayashi
Luca Versari
Songlin Yang
...
Guillaume Lajoie
Charlotte Frenkel
Razvan Pascanu
Blaise Agüera y Arcas
João Sacramento
106
1
0
05 Jun 2025
Temporal Chunking Enhances Recognition of Implicit Sequential Patterns
Jayanta Dey
Nicholas Soures
Miranda Gonzales
Itamar Lerner
Christopher Kanan
Dhireesha Kudithipudi
35
0
0
31 May 2025
RAD: Redundancy-Aware Distillation for Hybrid Models via Self-Speculative Decoding
Yuichiro Hoshino
Hideyuki Tachibana
Muneyoshi Inahara
Hiroto Takegawa
76
0
0
28 May 2025
Sparsified State-Space Models are Efficient Highway Networks
Woomin Song
Jihoon Tack
Sangwoo Mo
Seunghyuk Oh
Jinwoo Shin
Mamba
41
0
0
27 May 2025
SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences
Jungyoub Cha
Hyunjong Kim
Sungzoon Cho
VLM
80
0
0
27 May 2025
DISRetrieval: Harnessing Discourse Structure for Long Document Retrieval
H. Chen
Yi Yang
Yinghui Li
Meishan Zhang
Min Zhang
RALM
20
0
0
26 May 2025
Accelerating Prefilling for Long-Context LLMs via Sparse Pattern Sharing
Dan Peng
Zhihui Fu
Zewen Ye
Zhuoran Song
Jun Wang
46
0
0
26 May 2025
SELF: Self-Extend the Context Length With Logistic Growth Function
Phat Thanh Dang
Saahil Thoppay
Wang Yang
Qifan Wang
Vipin Chaudhary
Xiaotian Han
110
0
0
22 May 2025
LLM-Based Emulation of the Radio Resource Control Layer: Towards AI-Native RAN Protocols
Ziming Liu
Bryan Liu
Alvaro Valcarce
Xiaoli Chu
246
1
0
22 May 2025
PSC: Extending Context Window of Large Language Models via Phase Shift Calibration
Wenqiao Zhu
Chao Xu
Lulu Wang
Jun Wu
107
1
0
18 May 2025
Enhancing Cache-Augmented Generation (CAG) with Adaptive Contextual Compression for Scalable Knowledge Integration
Rishabh Agrawal
Himanshu Kumar
92
0
0
13 May 2025
Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
Zehao Fan
Garrett Gagnon
Zhenyu Liu
Liu Liu
61
0
0
09 May 2025
Breaking Quadratic Barriers: A Non-Attention LLM for Ultra-Long Context Horizons
Andrew Kiruluta
Preethi Raju
Priscilla Burity
30
0
0
09 May 2025
FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension
Jushi Kai
Boyi Zeng
Yansen Wang
Haoli Bai
Ziwei He
Bo Jiang
Zhouhan Lin
136
0
0
01 May 2025
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions
Yiming Du
Wenyu Huang
Danna Zheng
Zhaowei Wang
Sébastien Montella
Mirella Lapata
Kam-Fai Wong
Jeff Z. Pan
KELM · MU
237
5
0
01 May 2025
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
Yuxuan Tian
Zihan Wang
Yebo Peng
Aomufei Yuan
Zhaoxiang Wang
Bairen Yi
Xin Liu
Yong Cui
Tong Yang
75
0
0
14 Apr 2025
The Method for Storing Patterns in Neural Networks-Memorization and Recall of QR code Patterns-
Hiroshi Inazawa
30
0
0
09 Apr 2025
Safe Screening Rules for Group OWL Models
Runxue Bao
Quanchao Lu
Yanfu Zhang
110
0
0
04 Apr 2025
InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation
Bowen Cao
Deng Cai
W. Lam
CLL
101
1
0
02 Apr 2025
Long Context Modeling with Ranked Memory-Augmented Retrieval
Ghadir Alselwi
Hao Xue
Shoaib Jameel
Basem Suleiman
Flora D. Salim
Imran Razzak
RALM
129
0
0
19 Mar 2025
Sliding Window Attention Training for Efficient Large Language Models
Zichuan Fu
Wentao Song
Yansen Wang
X. Wu
Yefeng Zheng
Yingying Zhang
Derong Xu
Xuetao Wei
Tong Xu
Xiangyu Zhao
145
2
0
26 Feb 2025
TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation
Tong Wu
Junzhe Shen
Zixia Jia
Yanjie Wang
Zilong Zheng
124
1
0
26 Feb 2025
Tokenization is Sensitive to Language Variation
Anna Wegmann
Dong Nguyen
David Jurgens
155
2
0
21 Feb 2025
Neural Attention Search
Difan Deng
Marius Lindauer
146
0
0
21 Feb 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Anton Razzhigaev
Matvey Mikhalchuk
Temurbek Rahmatullaev
Elizaveta Goncharova
Polina Druzhinina
Ivan Oseledets
Andrey Kuznetsov
125
5
0
20 Feb 2025
FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference
Bingzhe Zhao
Ke Cheng
Aomufei Yuan
Yuxuan Tian
Ruiguang Zhong
Chengchen Hu
Tong Yang
Lian Yu
122
0
0
19 Feb 2025
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
Yuri Kuratov
M. Arkhipov
Aydar Bulatov
Andrey Kravchenko
139
3
0
18 Feb 2025
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Andrey Kravchenko
134
4
0
17 Feb 2025
LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation
Zican Dong
Junyi Li
Jinhao Jiang
Mingyu Xu
Wayne Xin Zhao
Bin Wang
Xin Wu
VLM
371
5
0
11 Feb 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An
Junyoung Sung
Wonpyo Park
Chanjun Park
Paul Hongsuck Seo
232
0
0
10 Feb 2025
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
Rishabh Tiwari
Haocheng Xi
Aditya Tomar
Coleman Hooper
Sehoon Kim
Maxwell Horton
Mahyar Najibi
Michael W. Mahoney
Kemal Kurniawan
Amir Gholami
MQ
112
5
0
05 Feb 2025
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
C. Lin
Jiaming Tang
Shuo Yang
Hanshuo Wang
Tian Tang
Boyu Tian
Ion Stoica
Enze Xie
Mingyu Gao
180
5
0
04 Feb 2025
Vision-centric Token Compression in Large Language Model
Ling Xing
Alex Jinpeng Wang
Rui Yan
Xiangbo Shu
Jinhui Tang
VLM
159
0
0
02 Feb 2025
Efficient Language Modeling for Low-Resource Settings with Hybrid RNN-Transformer Architectures
Gabriel Lindenmaier
Sean Papay
Sebastian Padó
154
0
0
02 Feb 2025
Irrational Complex Rotations Empower Low-bit Optimizers
Zhen Tian
Wayne Xin Zhao
Ji-Rong Wen
MQ
73
0
0
22 Jan 2025
Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models
Zhisong Zhang
Yan Wang
Xinting Huang
Tianqing Fang
Han Zhang
Chenlong Deng
Shuaiyi Li
Dong Yu
150
6
0
21 Dec 2024
Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models
Elvis Nunez
Luca Zancato
Benjamin Bowman
Aditya Golatkar
Wei Xia
Stefano Soatto
221
4
0
17 Dec 2024
ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression
Guangda Liu
Chong Li
Jieru Zhao
Chenqi Zhang
Minyi Guo
117
13
0
04 Dec 2024
Squeezed Attention: Accelerating Long Context Length LLM Inference
Coleman Hooper
Sehoon Kim
Hiva Mohammadzadeh
Monishwaran Maheswaran
June Paik
Michael W. Mahoney
Kemal Kurniawan
Amir Gholami
178
16
0
14 Nov 2024
Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle
Hui Dai
Ryan Teehan
Mengye Ren
KELM · AIFin · ELM
47
1
0
13 Nov 2024
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu
Zhuoshi Pan
Chao Wang
L. Chen
Y. Bai
Kun Fu
Zehua Wang
Hui Xiong
LLMAG
178
7
0
05 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
246
3
0
01 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
157
13
0
31 Oct 2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Hanshi Sun
Li-Wen Chang
Yiyuan Ma
Wenlei Bao
Ningxin Zheng
Xin Liu
Harry Dong
Yuejie Chi
Beidi Chen
VLM
165
21
0
28 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
137
7
0
28 Oct 2024
Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning
Aosong Feng
Rex Ying
Leandros Tassiulas
58
2
0
28 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
184
7
0
24 Oct 2024
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
Yizhao Gao
Zhichen Zeng
Dayou Du
Shijie Cao
Hayden Kwok-Hay So
...
Junjie Lai
Mao Yang
Ting Cao
Fan Yang
M. Yang
148
28
0
17 Oct 2024