ResearchTrend.AI


Extending Context Window of Large Language Models via Positional Interpolation

27 June 2023
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian

Papers citing "Extending Context Window of Large Language Models via Positional Interpolation"

50 / 388 papers shown
Title
Long-Context Language Modeling with Parallel Context Encoding
Howard Yen
Tianyu Gao
Danqi Chen
40
43
0
26 Feb 2024
MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models
Nathanaël Carraz Rakotonirina
Marco Baroni
VLM
KELM
35
0
0
23 Feb 2024
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
Lucas Lehnert
Sainbayar Sukhbaatar
DiJia Su
Qinqing Zheng
Paul Mcvay
Michael Rabbat
Yuandong Tian
37
54
0
21 Feb 2024
$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens
Xinrong Zhang
Yingfa Chen
Shengding Hu
Zihang Xu
Junhao Chen
...
Xu Han
Zhen Leng Thai
Shuo Wang
Zhiyuan Liu
Maosong Sun
RALM
LRM
50
154
0
21 Feb 2024
User-LLM: Efficient LLM Contextualization with User Embeddings
Lin Ning
Luyang Liu
Jiaxing Wu
Neo Wu
D. Berlowitz
Sushant Prakash
Bradley Green
S. O’Banion
Jun Xie
59
34
0
21 Feb 2024
LongWanjuan: Towards Systematic Measurement for Long Text Quality
Kai Lv
Xiaoran Liu
Qipeng Guo
Hang Yan
Conghui He
Xipeng Qiu
Dahua Lin
33
4
0
21 Feb 2024
Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective Questions
Liyan Xu
JiangNan Li
Mo Yu
Jie Zhou
41
3
0
21 Feb 2024
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies
Xiao Ye
Andrew Wang
Jacob Choi
Yining Lu
Shreya Sharma
Lingfeng Shen
Vijay Tiyyala
Nicholas Andrews
Daniel Khashabi
ELM
44
8
0
19 Feb 2024
LVCHAT: Facilitating Long Video Comprehension
Yu Wang
Zeyuan Zhang
Julian McAuley
Zexue He
VLM
32
4
0
19 Feb 2024
Extensible Embedding: A Flexible Multipler For LLM's Context Length
Ninglu Shao
Shitao Xiao
Zheng Liu
Peitian Zhang
32
1
0
18 Feb 2024
BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models
Kun Luo
Zheng Liu
Shitao Xiao
Kang Liu
41
11
0
18 Feb 2024
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
Jun Zhao
Can Zu
Haotian Xu
Yi Lu
Wei He
Yiwen Ding
Tao Gui
Qi Zhang
Xuanjing Huang
RALM
LLMAG
47
22
0
18 Feb 2024
LongHeads: Multi-Head Attention is Secretly a Long Context Processor
Yi Lu
Xin Zhou
Wei He
Jun Zhao
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
LLMAG
50
11
0
16 Feb 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
James Liu
Guangxuan Xiao
Kai Li
Jason D. Lee
Song Han
Tri Dao
Tianle Cai
45
21
0
15 Feb 2024
Data Engineering for Scaling Language Models to 128K Context
Yao Fu
Yikang Shen
Xinyao Niu
Xiang Yue
Hanna Hajishirzi
Yoon Kim
Hao-Chun Peng
MoE
50
124
0
15 Feb 2024
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Kuang-Huei Lee
Xinyun Chen
Hiroki Furuta
John F. Canny
Ian S. Fischer
RALM
55
30
0
15 Feb 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
52
36
0
14 Feb 2024
Leveraging the Context through Multi-Round Interactions for Jailbreaking Attacks
Yixin Cheng
Markos Georgopoulos
V. Cevher
Grigorios G. Chrysos
AAML
27
15
0
14 Feb 2024
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu
Wilson Yan
Matei A. Zaharia
Pieter Abbeel
VGen
39
64
0
13 Feb 2024
Lissard: Long and Simple Sequential Reasoning Datasets
M. Bueno
R. Lotufo
Rodrigo Nogueira
RALM
LRM
33
2
0
12 Feb 2024
MEMORYLLM: Towards Self-Updatable Large Language Models
Yu Wang
Yifan Gao
Xiusi Chen
Haoming Jiang
Shiyang Li
...
Zheng Li
Xian Li
Bing Yin
Jingbo Shang
Julian McAuley
KELM
37
17
0
07 Feb 2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao
Pengle Zhang
Xu Han
Guangxuan Xiao
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
LLMAG
47
35
0
07 Feb 2024
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry
Michael Zhang
Kush S. Bhatia
Hermann Kumbong
Christopher Ré
35
48
0
06 Feb 2024
Nevermind: Instruction Override and Moderation in Large Language Models
Edward Kim
ALM
26
0
0
05 Feb 2024
UniMem: Towards a Unified View of Long-Context Large Language Models
Junjie Fang
Likai Tang
Hongzhe Bi
Yujia Qin
Si Sun
...
Xiaodong Shi
Sen Song
Yankai Lin
Zhiyuan Liu
Maosong Sun
27
3
0
05 Feb 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
Xindi Wang
Mahsa Salmani
Parsa Omidi
Xiangyu Ren
Mehdi Rezagholizadeh
A. Eshaghi
LRM
39
36
0
03 Feb 2024
Nomic Embed: Training a Reproducible Long Context Text Embedder
Zach Nussbaum
John X. Morris
Brandon Duderstadt
Andriy Mulyar
27
100
0
02 Feb 2024
BlackMamba: Mixture of Experts for State-Space Models
Quentin G. Anthony
Yury Tokpanov
Paolo Glorioso
Beren Millidge
38
21
0
01 Feb 2024
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Coleman Hooper
Sehoon Kim
Hiva Mohammadzadeh
Michael W. Mahoney
Y. Shao
Kurt Keutzer
A. Gholami
MQ
25
181
0
31 Jan 2024
LongAlign: A Recipe for Long Context Alignment of Large Language Models
Yushi Bai
Xin Lv
Jiajie Zhang
Yuze He
Ji Qi
Lei Hou
Jie Tang
Yuxiao Dong
Juanzi Li
ALM
42
46
0
31 Jan 2024
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Zhenyu He
Guhao Feng
Shengjie Luo
Kai-Bo Yang
Liwei Wang
Jingjing Xu
Zhi Zhang
Hongxia Yang
Di He
32
14
0
29 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
48
12
0
26 Jan 2024
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Daya Guo
Qihao Zhu
Dejian Yang
Zhenda Xie
Kai Dong
...
Yu-Huan Wu
Y. K. Li
Fuli Luo
Yingfei Xiong
W. Liang
ELM
62
695
0
25 Jan 2024
With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation
Y. Wang
D. Ma
D. Cai
RALM
49
19
0
21 Jan 2024
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
Junjie Ye
Yilong Wu
Songyang Gao
Caishuang Huang
Sixian Li
Guanyu Li
Xiaoran Fan
Qi Zhang
Tao Gui
Xuanjing Huang
AAML
35
16
0
16 Jan 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
69
35
0
16 Jan 2024
The Chronicles of RAG: The Retriever, the Chunk and the Generator
Paulo Finardi
Leonardo Avila
Rodrigo Castaldoni
P. Gengo
Celio H. N. Larcher
Marcos Piau
Pablo B. Costa
Vinicius Fernandes Caridá
RALM
22
29
0
15 Jan 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Aman Chadha
Amitava Das
42
28
0
15 Jan 2024
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization
Ninglu Shao
Shitao Xiao
Zheng Liu
Peitian Zhang
36
4
0
15 Jan 2024
Extending LLMs' Context Window with 100 Samples
Yikai Zhang
Junlong Li
Pengfei Liu
37
11
0
13 Jan 2024
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
Jiaheng Liu
Zhiqi Bai
Yuanxing Zhang
Chenchen Zhang
Yu Zhang
...
Wenbo Su
Tiezheng Ge
Jie Fu
Wenhu Chen
Bo Zheng
48
8
0
13 Jan 2024
Transformers are Multi-State RNNs
Matanel Oren
Michael Hassid
Nir Yarden
Yossi Adi
Roy Schwartz
OffRL
32
37
0
11 Jan 2024
Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing
Zi Yang
Nan Hua
RALM
49
4
0
10 Jan 2024
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
72
22
0
09 Jan 2024
TeleChat Technical Report
Zhongjiang He
Zihan Wang
Xinzhan Liu
Shixuan Liu
Yitong Yao
...
Zilu Huang
Sishi Xiong
Yuxiang Zhang
Chao Wang
Shuangyong Song
AI4MH
LRM
ALM
66
3
0
08 Jan 2024
GRAM: Global Reasoning for Multi-Page VQA
Tsachi Blau
Sharon Fogel
Roi Ronen
Alona Golts
Roy Ganz
Elad Ben Avraham
Aviad Aberdam
Shahar Tsiper
Ron Litman
22
12
0
07 Jan 2024
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Ernest Perkowski
Rui Pan
Tuan Dung Nguyen
Yuan-Sen Ting
Sandor Kruk
...
Michael J. Smith
Huiling Liu
Kevin Schawinski
K. Iyer
I. Ciucă
AI4MH
20
12
0
03 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Xia Hu
47
101
0
02 Jan 2024
Structured Packing in LLM Training Improves Long Context Utilization
Konrad Staniszewski
Szymon Tworkowski
Sebastian Jaszczur
Yu Zhao
Henryk Michalewski
Łukasz Kuciński
Piotr Miłoś
41
13
0
28 Dec 2023
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices
Xiangxiang Chu
Limeng Qiao
Xinyang Lin
Shuang Xu
Yang Yang
...
Fei Wei
Xinyu Zhang
Bo Zhang
Xiaolin Wei
Chunhua Shen
MLLM
44
35
0
28 Dec 2023