ResearchTrend.AI
arXiv: 2309.00071
YaRN: Efficient Context Window Extension of Large Language Models

31 August 2023
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
    OSLM
ArXiv (abs) | PDF | HTML | GitHub (1489★)

Papers citing "YaRN: Efficient Context Window Extension of Large Language Models"

50 / 199 papers shown
Title
ReAttention: Training-Free Infinite Context with Finite Attention Scope
Xiaoran Liu
Ruixiao Li
Yuerong Song
Zhigeng Liu
Kai Lv
Hang Yan
Linlin Li
Qun Liu
Xipeng Qiu
LLMAG
64
4
0
21 Jul 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng Xu
Ming-Yu Liu
Xianchao Wu
Zihan Liu
Mohammad Shoeybi
Bryan Catanzaro
RALM
162
21
0
19 Jul 2024
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Jian Xu
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM VLM MU
236
989
0
15 Jul 2024
Human-like Episodic Memory for Infinite Context LLMs
Zafeirios Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
Haitham Bou-Ammar
Jun Wang
88
21
0
12 Jul 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
Jay Shah
Ganesh Bikshandi
Ying Zhang
Vijay Thakkar
Pradeep Ramani
Tri Dao
151
156
0
11 Jul 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang
Yucheng Li
Chengruidong Zhang
Qianhui Wu
Xufang Luo
...
Amir H. Abdi
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
L. Qiu
153
122
0
02 Jul 2024
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
Yubo Ma
Yuhang Zang
Liangyu Chen
Meiqi Chen
Yizhu Jiao
...
Liangming Pan
Yu-Gang Jiang
Jiaqi Wang
Yixin Cao
Aixin Sun
ELM RALM VLM
111
33
0
01 Jul 2024
AI-native Memory: A Pathway from LLMs Towards AGI
Jingbo Shang
Zai Zheng
Jiale Wei
Xiang Ying
Felix Tao
Mindverse Team
LLMAG
115
8
0
26 Jun 2024
UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
Wenhao Li
Mingbao Lin
Mingliang Xu
Shuicheng Yan
Rongrong Ji
71
0
0
26 Jun 2024
Long Context Transfer from Language to Vision
Peiyuan Zhang
Kaichen Zhang
Bo Li
Guangtao Zeng
Jingkang Yang
Yuanhan Zhang
Ziyue Wang
Haoran Tan
Chunyuan Li
Ziwei Liu
VLM
145
189
0
24 Jun 2024
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
Brandon Huang
Chancharik Mitra
Assaf Arbelle
Leonid Karlinsky
Trevor Darrell
Roei Herzig
101
21
0
21 Jun 2024
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Ziyan Jiang
Xueguang Ma
Wenhu Chen
RALM
135
59
0
21 Jun 2024
MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens
Yongqi Fan
Hongli Sun
Kui Xue
Xiaofan Zhang
Shaoting Zhang
Tong Ruan
119
2
0
21 Jun 2024
ACR: A Benchmark for Automatic Cohort Retrieval
Dung Ngoc Thai
Victor Ardulov
Jose Ulises Mena
Simran Tiwari
Gleb Erofeev
R. Eskander
Karim Tarabishy
Ravi B Parikh
Wael Salloum
99
1
0
20 Jun 2024
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish
Itamar Zimerman
Shady Abu Hussein
Nadav Cohen
Amir Globerson
Lior Wolf
Raja Giryes
Mamba
206
20
0
20 Jun 2024
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLM VLM
130
28
0
18 Jun 2024
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
DeepSeek-AI
Qihao Zhu
Daya Guo
Zhihong Shao
Dejian Yang
...
Jiashi Li
Chenggang Zhao
Chong Ruan
Fuli Luo
Wenfeng Liang
MoE LRM ELM VLM
103
209
0
17 Jun 2024
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM ALM LRM ReLM ELM
102
82
0
14 Jun 2024
3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding
Xindian Ma
Wenyuan Liu
Peng Zhang
Nan Xu
68
3
0
14 Jun 2024
From Text to Life: On the Reciprocal Relationship between Artificial Life and Large Language Models
Eleni Nisioti
Claire Glanois
Elias Najarro
Andrew Dai
Elliot Meyerson
J. Pedersen
Laetitia Teodorescu
Conor F. Hayes
Shyam Sudhakaran
Sebastian Risi
AI4CE LM&Ro
103
4
0
14 Jun 2024
LieRE: Lie Rotational Positional Encodings
Sophie Ostmeier
Brian Axelrod
Michael E. Moseley
Akshay S. Chaudhari
C. Langlotz
88
1
0
14 Jun 2024
LoCoCo: Dropping In Convolutions for Long Context Compression
Ruisi Cai
Yuandong Tian
Zhangyang Wang
Beidi Chen
95
11
0
08 Jun 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
Le Zhuo
Ruoyi Du
Han Xiao
Yangguang Li
Dongyang Liu
...
Wanli Ouyang
Ziwei Liu
Ping Luo
Hongsheng Li
Peng Gao
112
58
0
05 Jun 2024
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
Yijiong Yu
Huiqiang Jiang
Xufang Luo
Qianhui Wu
Chin-Yew Lin
Dongsheng Li
Yuqing Yang
Yongfeng Huang
L. Qiu
125
10
0
04 Jun 2024
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models
Ken Deng
Jiaheng Liu
He Zhu
Congnan Liu
Jingxin Li
...
Yuanxing Zhang
Wenbo Su
Bangyu Xiang
Tiezheng Ge
Bo Zheng
111
4
0
03 Jun 2024
PostDoc: Generating Poster from a Long Multimodal Document Using Deep Submodular Optimization
Vijay Jaisankar
Sambaran Bandyopadhyay
Kalp Vyas
Varre Chaitanya
Shwetha Somasundaram
58
3
0
30 May 2024
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao
Xing Wu
Qingfang Fu
Songlin Hu
SyDa
110
7
0
30 May 2024
XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference
Shengnan Wang
Youhui Bai
Lin Zhang
Pingyi Zhou
Shixiong Zhao
Gong Zhang
Sen Wang
Renhai Chen
Hua Xu
Hongwei Sun
128
5
0
28 May 2024
Transformers Can Do Arithmetic with the Right Embeddings
Sean McLeish
Arpit Bansal
Alex Stein
Neel Jain
John Kirchenbauer
...
B. Kailkhura
A. Bhatele
Jonas Geiping
Avi Schwarzschild
Tom Goldstein
78
37
0
27 May 2024
TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations
Zhengwu Zhang
Yuntong Hu
Bo Pan
Chen Ling
Liang Zhao
114
3
0
27 May 2024
Compressing Lengthy Context With UltraGist
Peitian Zhang
Zheng Liu
Shitao Xiao
Ninglu Shao
Qiwei Ye
Zhicheng Dou
46
4
0
26 May 2024
Are Long-LLMs A Necessity For Long-Context Tasks?
Hongjin Qian
Zheng Liu
Peitian Zhang
Kelong Mao
Yujia Zhou
Xu Chen
Zhicheng Dou
71
13
0
24 May 2024
Base of RoPE Bounds Context Length
Xin Men
Mingyu Xu
Bingning Wang
Qingyu Zhang
Hongyu Lin
Xianpei Han
Weipeng Chen
101
26
0
23 May 2024
Equipping Transformer with Random-Access Reading for Long-Context Understanding
Chenghao Yang
Zi Yang
Nan Hua
67
1
0
21 May 2024
RoTHP: Rotary Position Embedding-based Transformer Hawkes Process
Anningzhe Gao
Shan Dai
49
3
0
11 May 2024
Linearizing Large Language Models
Jean Mercat
Igor Vasiljevic
Sedrick Scott Keh
Kushal Arora
Achal Dave
Adrien Gaidon
Thomas Kollar
101
24
0
10 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
103
91
0
09 May 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
Yutao Sun
Li Dong
Yi Zhu
Shaohan Huang
Wenhui Wang
Shuming Ma
Quanlu Zhang
Jianyong Wang
Furu Wei
VLM
99
64
0
08 May 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
...
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
MoE
170
500
0
07 May 2024
Long Context Alignment with Short Instructions and Synthesized Positions
Wenhao Wu
Yizhong Wang
Yao Fu
Xiang Yue
Dawei Zhu
Sujian Li
SyDa
80
19
0
07 May 2024
Make Your LLM Fully Utilize the Context
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
SyDa
153
67
0
25 Apr 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
95
27
0
18 Apr 2024
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Hanshi Sun
Zhuoming Chen
Xinyu Yang
Yuandong Tian
Beidi Chen
121
65
0
18 Apr 2024
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
Woomin Song
Seunghyuk Oh
Sangwoo Mo
Jaehyung Kim
Sukmin Yun
Jung-Woo Ha
Jinwoo Shin
74
21
0
16 Apr 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Xuezhe Ma
Xiaomeng Yang
Wenhan Xiong
Beidi Chen
Lili Yu
Hao Zhang
Jonathan May
Luke Zettlemoyer
Omer Levy
Chunting Zhou
90
33
0
12 Apr 2024
LLoCO: Learning Long Contexts Offline
Sijun Tan
Xiuyu Li
Shishir G. Patil
Ziyang Wu
Tianjun Zhang
Kurt Keutzer
Joseph E. Gonzalez
Raluca A. Popa
RALM OffRL LLMAG
102
8
0
11 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM LLMAG CLL
157
124
0
10 Apr 2024
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
Shengding Hu
Yuge Tu
Xu Han
Chaoqun He
Ganqu Cui
...
Chaochao Jia
Guoyang Zeng
Dahai Li
Zhiyuan Liu
Maosong Sun
MoE
131
347
0
09 Apr 2024
Long-context LLMs Struggle with Long In-context Learning
Tianle Li
Ge Zhang
Quy Duc Do
Xiang Yue
Wenhu Chen
103
194
0
02 Apr 2024
A Survey on Large Language Model-Based Game Agents
Sihao Hu
Tiansheng Huang
Gaowen Liu
Ramana Rao Kompella
Selim Furkan Tekin
Yichang Xu
Zachary Yahn
Ling Liu
LLMAG LM&Ro AI4CE LM&MA
231
58
0
02 Apr 2024