2306.15595
Extending Context Window of Large Language Models via Positional Interpolation
27 June 2023
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
Papers citing
"Extending Context Window of Large Language Models via Positional Interpolation"
50 / 388 papers shown
Title
LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation
Zican Dong
Junyi Li
Jinhao Jiang
Mingyu Xu
Wayne Xin Zhao
Bin Wang
Xin Wu
VLM
213
4
0
20 Feb 2025
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
Cheng Luo
Zefan Cai
Hanshi Sun
Jinqi Xiao
Bo Yuan
Wen Xiao
Junjie Hu
Jiawei Zhao
Beidi Chen
Anima Anandkumar
69
1
0
18 Feb 2025
Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning
Qifan Yu
Zhenyu He
Sijie Li
Xun Zhou
Jun Zhang
Jingjing Xu
Di He
OffRL
LRM
89
5
0
12 Feb 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An
Junyoung Sung
Wonpyo Park
Chanjun Park
Paul Hongsuck Seo
102
0
0
10 Feb 2025
Large Language Models for In-File Vulnerability Localization Can Be "Lost in the End"
Francesco Sovrano
Adam Bauer
Alberto Bacchelli
54
1
0
09 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
245
4
0
04 Feb 2025
Context-Aware Hierarchical Merging for Long Document Summarization
Litu Ou
Mirella Lapata
MoMe
271
1
0
03 Feb 2025
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
LRM
56
0
0
28 Jan 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Yuqing Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
ELM
LRM
52
3
0
25 Jan 2025
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Jianing Yang
Alexander Sax
Kevin J Liang
Mikael Henaff
Hao Tang
Ang Cao
J. Chai
Franziska Meier
Matt Feiszli
3DGS
81
16
0
23 Jan 2025
NExtLong: Toward Effective Long-Context Training without Long Documents
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
75
1
0
22 Jan 2025
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
Thibaut Thonet
Jos Rozen
Laurent Besacier
RALM
145
2
0
20 Jan 2025
Visual RAG: Expanding MLLM visual knowledge without fine-tuning
Mirco Bonomo
Simone Bianco
VLM
77
5
0
18 Jan 2025
Guiding Retrieval using LLM-based Listwise Rankers
Mandeep Rathee
Sean MacAvaney
Avishek Anand
KELM
LRM
73
4
0
17 Jan 2025
ListConRanker: A Contrastive Text Reranker with Listwise Encoding
Junlong Liu
Yue Ma
Ruihui Zhao
Junhao Zheng
Qianli Ma
Yangyang Kang
49
0
0
13 Jan 2025
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Hadi Pouransari
Chun-Liang Li
Jen-Hao Rick Chang
Pavan Kumar Anasosalu Vasu
Cem Koc
Vaishaal Shankar
Oncel Tuzel
42
8
0
08 Jan 2025
Lost-in-Distance: Impact of Contextual Proximity on LLM Performance in Graph Tasks
Hamed Firooz
Maziar Sanjabi
Wenlong Jiang
Xiaoling Zhai
73
3
0
03 Jan 2025
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
Jiajun Zhu
Peihao Wang
Ruisi Cai
Jason D. Lee
Pan Li
Zhendong Wang
KELM
53
1
0
03 Jan 2025
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
Chenlong Deng
Zhisong Zhang
Kelong Mao
Shuaiyi Li
Xinting Huang
Dong Yu
Zhicheng Dou
42
1
0
23 Dec 2024
Investigating Length Issues in Document-level Machine Translation
Ziqian Peng
Rachel Bawden
François Yvon
71
1
0
23 Dec 2024
Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models
Elvis Nunez
L. Zancato
Benjamin Bowman
Aditya Golatkar
W. Xia
Stefano Soatto
88
2
0
17 Dec 2024
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Tianwei Yin
Qiang Zhang
Richard Zhang
William T. Freeman
F. Durand
Eli Shechtman
Xun Huang
VGen
DiffM
86
5
0
10 Dec 2024
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models
Haoran Lian
Junmin Chen
Wei Huang
Yizhe Xiong
Wenping Hu
...
Hui Chen
Jianwei Niu
Zijia Lin
Fuzheng Zhang
Di Zhang
86
0
0
10 Dec 2024
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs
Michael Wornow
Suhana Bedi
Miguel Angel Fuentes Hernandez
E. Steinberg
Jason Alan Fries
Christopher Ré
Sanmi Koyejo
N. Shah
100
4
0
09 Dec 2024
Rank It, Then Ask It: Input Reranking for Maximizing the Performance of LLMs on Symmetric Tasks
Mohsen Dehghankar
Abolfazl Asudeh
69
1
0
30 Nov 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Haonan Wang
Qian Liu
Chao Du
Tongyao Zhu
Cunxiao Du
Kenji Kawaguchi
Tianyu Pang
115
6
0
20 Nov 2024
Reducing Distraction in Long-Context Language Models by Focused Learning
Zijun Wu
Bingyuan Liu
Ran Yan
Lei Chen
Thomas Delteil
RALM
44
2
0
08 Nov 2024
The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare
Souren Pashangpour
Goldie Nejat
LM&MA
53
7
0
05 Nov 2024
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu
Zhuoshi Pan
Chao Wang
L. Chen
Y. Bai
Kun Fu
Zehua Wang
Hui Xiong
LLMAG
44
5
0
05 Nov 2024
TeleOracle: Fine-Tuned Retrieval-Augmented Generation with Long-Context Support for Network
Nouf Alabbasi
Omar Erak
Omar Alhussein
Ismail Lotfi
Sami Muhaidat
Merouane Debbah
RALM
231
0
0
04 Nov 2024
Fashion-VDM: Video Diffusion Model for Virtual Try-On
J. Karras
Yingwei Li
Nan Liu
Luyang Zhu
Innfarn Yoo
Andreas Lugmayr
Chris Lee
Ira Kemelmacher-Shlizerman
DiffM
VGen
40
4
0
31 Oct 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
69
6
0
31 Oct 2024
Understanding Synthetic Context Extension via Retrieval Heads
Xinyu Zhao
Fangcong Yin
Greg Durrett
41
0
0
29 Oct 2024
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
Yuhan Chen
Ang Lv
Jian Luan
Bin Wang
Wen Liu
41
4
0
28 Oct 2024
Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning
Aosong Feng
Rex Ying
Leandros Tassiulas
32
2
0
28 Oct 2024
Two are better than one: Context window extension with multi-grained self-injection
Wei Han
Pan Zhou
Soujanya Poria
Shuicheng Yan
29
0
0
25 Oct 2024
LOGO -- Long cOntext aliGnment via efficient preference Optimization
Zecheng Tang
Zechen Sun
Juntao Li
Qiaoming Zhu
Min Zhang
37
1
0
24 Oct 2024
LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
Qingfei Zhao
Ruobing Wang
Yukuo Cen
Daren Zha
Shicheng Tan
Yuxiao Dong
Jie Tang
RALM
49
9
0
23 Oct 2024
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Taewhoo Lee
Chanwoong Yoon
Kyochul Jang
Donghyeon Lee
Minju Song
Hyunjae Kim
Jaewoo Kang
ELM
35
1
0
22 Oct 2024
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs
Xin Ma
Yang Liu
Jiaheng Liu
Xiaoxu Ma
31
1
0
21 Oct 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
ZiDong Wang
Zeyu Lu
Di Huang
Cai Zhou
Wanli Ouyang
Lei Bai
76
3
0
17 Oct 2024
An Evolved Universal Transformer Memory
Edoardo Cetin
Qi Sun
Tianyu Zhao
Yujin Tang
221
0
0
17 Oct 2024
How much do contextualized representations encode long-range context?
Simeng Sun
Cheng-Ping Hsieh
48
0
0
16 Oct 2024
In-Context Learning for Long-Context Sentiment Analysis on Infrastructure Project Opinions
Alireza Shamshiri
Kyeong Rok Ryu
June Young Park
LLMAG
24
1
0
15 Oct 2024
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Ziyue Li
Dinesh Manocha
MoE
74
6
0
14 Oct 2024
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska
Mohammad Mahdi Derakhshani
Yuki M. Asano
Nanne van Noord
Marcel Worring
Cees G. M. Snoek
VLM
50
3
0
13 Oct 2024
On the token distance modeling ability of higher RoPE attention dimension
Xiangyu Hong
Che Jiang
Biqing Qi
Fandong Meng
Mo Yu
Bowen Zhou
Jie Zhou
44
4
0
11 Oct 2024
InAttention: Linear Context Scaling for Transformers
Joseph Eisner
26
0
0
09 Oct 2024
FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding
Jingyang Deng
Zhengyang Shen
Boyang Wang
Lixin Su
Suqi Cheng
Ying Nie
Junfeng Wang
Dawei Yin
Jinwen Ma
39
1
0
09 Oct 2024
SEGMENT+: Long Text Processing with Short-Context Language Models
Wei Shi
Shuang Li
Kerun Yu
Jinglei Chen
Zujie Liang
...
Feng Wei
Bo Zheng
Jiaqing Liang
Jiangjie Chen
Yanghua Xiao
RALM
VLM
57
2
0
09 Oct 2024