2306.15595
Extending Context Window of Large Language Models via Positional Interpolation
27 June 2023
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
Papers citing
"Extending Context Window of Large Language Models via Positional Interpolation"
50 / 117 papers shown
Title
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Hadi Pouransari
Chun-Liang Li
Jen-Hao Rick Chang
Pavan Kumar Anasosalu Vasu
Cem Koc
Vaishaal Shankar
Oncel Tuzel
95
11
0
08 Jan 2025
Lost-in-Distance: Impact of Contextual Proximity on LLM Performance in Graph Tasks
Hamed Firooz
Maziar Sanjabi
Wenlong Jiang
Xiaoling Zhai
144
3
0
03 Jan 2025
Investigating Length Issues in Document-level Machine Translation
Ziqian Peng
Rachel Bawden
François Yvon
108
2
0
23 Dec 2024
Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models
Elvis Nunez
Luca Zancato
Benjamin Bowman
Aditya Golatkar
Wei Xia
Stefano Soatto
219
4
0
17 Dec 2024
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs
Michael Wornow
Suhana Bedi
Miguel Angel Fuentes Hernandez
E. Steinberg
Jason Alan Fries
Christopher Ré
Sanmi Koyejo
N. Shah
246
6
0
09 Dec 2024
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu
Zhuoshi Pan
Chao Wang
L. Chen
Y. Bai
Kun Fu
Zehua Wang
Hui Xiong
LLMAG
178
7
0
05 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
157
13
0
31 Oct 2024
Understanding Synthetic Context Extension via Retrieval Heads
Xinyu Zhao
Fangcong Yin
Greg Durrett
161
2
0
29 Oct 2024
Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning
Aosong Feng
Rex Ying
Leandros Tassiulas
58
2
0
28 Oct 2024
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Taewhoo Lee
Chanwoong Yoon
Kyochul Jang
Donghyeon Lee
Minju Song
Hyunjae Kim
Jaewoo Kang
ELM
80
1
0
22 Oct 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
ZiDong Wang
Zeyu Lu
Di Huang
Cai Zhou
Wanli Ouyang
Lei Bai
126
6
0
17 Oct 2024
An Evolved Universal Transformer Memory
Edoardo Cetin
Qi Sun
Tianyu Zhao
Yujin Tang
507
0
0
17 Oct 2024
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Ziyue Li
Dinesh Manocha
MoE
159
19
0
14 Oct 2024
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska
Mohammad Mahdi Derakhshani
Yuki M. Asano
Nanne van Noord
Marcel Worring
Cees G. M. Snoek
VLM
143
4
0
13 Oct 2024
PECAN: LLM-Guided Dynamic Progress Control with Attention-Guided Hierarchical Weighted Graph for Long-Document QA
Xinyu Wang
Yanzheng Xiang
Lin Gui
Yulan He
84
2
0
07 Oct 2024
Accelerating Inference of Networks in the Frequency Domain
Chenqiu Zhao
Guanfang Dong
Anup Basu
122
20
0
06 Oct 2024
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
103
9
0
05 Oct 2024
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
202
48
0
03 Oct 2024
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
143
37
0
03 Oct 2024
Auto-Demo Prompting: Leveraging Generated Outputs as Demonstrations for Enhanced Batch Prompting
Longyu Feng
Mengze Hong
Chen Jason Zhang
80
2
0
02 Oct 2024
Extending Context Window of Large Language Models from a Distributional Perspective
Yingsheng Wu
Yuxuan Gu
Xiaocheng Feng
Weihong Zhong
Dongliang Xu
Qing Yang
Hongtao Liu
Bing Qin
42
2
0
02 Oct 2024
Visual Context Window Extension: A New Perspective for Long Video Understanding
Hongchen Wei
Zhenzhong Chen
VLM
88
6
0
30 Sep 2024
PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference
Zeyu Zhang
Haiying Shen
VLM
80
1
0
23 Sep 2024
Flash STU: Fast Spectral Transform Units
Y. Isabel Liu
Windsor Nguyen
Yagiz Devre
Evan Dogariu
Anirudha Majumdar
Elad Hazan
AI4TS
156
1
0
16 Sep 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng Xu
Ming-Yu Liu
Xianchao Wu
Zihan Liu
Mohammad Shoeybi
Bryan Catanzaro
RALM
162
21
0
19 Jul 2024
Human-like Episodic Memory for Infinite Context LLMs
Zafeirios Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
Haitham Bou-Ammar
Jun Wang
88
21
0
12 Jul 2024
Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis
Jianxiang Yu
Zichen Ding
Jiaqi Tan
Kangyang Luo
Zhenmin Weng
...
Chengcheng Han
Qiushi Sun
Zhiyong Wu
Yunshi Lan
Xiang Li
79
6
0
09 Jul 2024
The Structure of Financial Equity Research Reports -- Identification of the Most Frequently Asked Questions in Financial Analyst Reports to Automate Equity Research Using Llama 3 and GPT-4
Adria Pop
Jan Spörer
14
0
0
04 Jul 2024
UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
Wenhao Li
Mingbao Lin
Mingliang Xu
Shuicheng Yan
Rongrong Ji
71
0
0
26 Jun 2024
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
Minzheng Wang
Longze Chen
Cheng Fu
Shengyi Liao
Xinghua Zhang
...
Run Luo
Yunshui Li
Min Yang
Fei Huang
Yongbin Li
RALM
105
60
0
25 Jun 2024
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Ziyan Jiang
Xueguang Ma
Wenhu Chen
RALM
135
59
0
21 Jun 2024
Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell
Taiming Lu
Muhan Gao
Kuai Yu
Adam Byerly
Daniel Khashabi
111
17
0
20 Jun 2024
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish
Itamar Zimerman
Shady Abu Hussein
Nadav Cohen
Amir Globerson
Lior Wolf
Raja Giryes
Mamba
206
20
0
20 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
182
69
0
11 Jun 2024
Evaluating Zero-Shot Long-Context LLM Compression
Chenyu Wang
Yihan Wang
Kai Li
124
0
0
10 Jun 2024
LongSSM: On the Length Extension of State-space Models in Language Modelling
Shida Wang
93
1
0
04 Jun 2024
Toward Conversational Agents with Context and Time Sensitive Long-term Memory
Nick Alonso
Tomás Figliolia
A. Ndirango
Beren Millidge
RALM
3DV
109
3
0
29 May 2024
XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference
Shengnan Wang
Youhui Bai
Lin Zhang
Pingyi Zhou
Shixiong Zhao
Gong Zhang
Sen Wang
Renhai Chen
Hua Xu
Hongwei Sun
128
5
0
28 May 2024
Base of RoPE Bounds Context Length
Xin Men
Mingyu Xu
Bingning Wang
Qingyu Zhang
Hongyu Lin
Xianpei Han
Weipeng Chen
101
26
0
23 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
103
91
0
09 May 2024
Long Context Alignment with Short Instructions and Synthesized Positions
Wenhao Wu
Yizhong Wang
Yao Fu
Xiang Yue
Dawei Zhu
Sujian Li
SyDa
80
19
0
07 May 2024
In-Context Learning with Long-Context Models: An In-Depth Exploration
Amanda Bertsch
Maor Ivgi
Uri Alon
Jonathan Berant
Matthew R. Gormley
Graham Neubig
ReLM
AIMat
189
80
0
30 Apr 2024
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Thomas Merth
Qichen Fu
Mohammad Rastegari
Mahyar Najibi
LRM
RALM
102
10
0
10 Apr 2024
A Survey on Large Language Model-Based Game Agents
Sihao Hu
Tiansheng Huang
Gaowen Liu
Ramana Rao Kompella
Selim Furkan Tekin
Yichang Xu
Zachary Yahn
Ling Liu
LLMAG
LM&Ro
AI4CE
LM&MA
231
57
0
02 Apr 2024
AIOS: LLM Agent Operating System
Kai Mei
Zelong Li
Wujiang Xu
Wenyue Hua
Mingyu Jin
Shuyuan Xu
Ruosong Ye
Yingqiang Ge
Yongfeng Zhang
LLMAG
149
25
0
25 Mar 2024
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
Cunxiang Wang
Ruoxi Ning
Boqi Pan
Tonghui Wu
Qipeng Guo
...
Guangsheng Bao
Xiangkun Hu
Zheng Zhang
Qian Wang
Yue Zhang
RALM
235
11
0
18 Mar 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
Zhenyu Zhang
Runjin Chen
Shiwei Liu
Zhewei Yao
Olatunji Ruwase
Beidi Chen
Xiaoxia Wu
Zhangyang Wang
95
36
0
05 Mar 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
120
43
0
14 Feb 2024
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu
Wilson Yan
Matei A. Zaharia
Pieter Abbeel
VGen
140
85
0
13 Feb 2024
Nomic Embed: Training a Reproducible Long Context Text Embedder
Zach Nussbaum
John X. Morris
Brandon Duderstadt
Andriy Mulyar
118
124
0
02 Feb 2024