YaRN: Efficient Context Window Extension of Large Language Models

31 August 2023
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
    OSLM

Papers citing "YaRN: Efficient Context Window Extension of Large Language Models"

50 / 178 papers shown
LoCoCo: Dropping In Convolutions for Long Context Compression
Ruisi Cai
Yuandong Tian
Zhangyang Wang
Beidi Chen
49
10
0
08 Jun 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
Le Zhuo
Ruoyi Du
Han Xiao
Yangguang Li
Dongyang Liu
...
Wanli Ouyang
Ziwei Liu
Ping Luo
Hongsheng Li
Peng Gao
52
47
0
05 Jun 2024
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
Yijiong Yu
Huiqiang Jiang
Xufang Luo
Qianhui Wu
Chin-Yew Lin
Dongsheng Li
Yuqing Yang
Yongfeng Huang
L. Qiu
52
9
0
04 Jun 2024
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models
Ken Deng
Jiaheng Liu
He Zhu
Congnan Liu
Jingxin Li
...
Yuanxing Zhang
Wenbo Su
Bangyu Xiang
Tiezheng Ge
Bo Zheng
50
2
0
03 Jun 2024
PostDoc: Generating Poster from a Long Multimodal Document Using Deep Submodular Optimization
Vijay Jaisankar
Sambaran Bandyopadhyay
Kalp Vyas
Varre Chaitanya
Shwetha Somasundaram
32
2
0
30 May 2024
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao
Xing Wu
Qingfang Fu
Songlin Hu
SyDa
34
5
0
30 May 2024
XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference
Shengnan Wang
Youhui Bai
Lin Zhang
Pingyi Zhou
Shixiong Zhao
Gong Zhang
Sen Wang
Renhai Chen
Hua Xu
Hongwei Sun
36
3
0
28 May 2024
Transformers Can Do Arithmetic with the Right Embeddings
Sean McLeish
Arpit Bansal
Alex Stein
Neel Jain
John Kirchenbauer
...
B. Kailkhura
A. Bhatele
Jonas Geiping
Avi Schwarzschild
Tom Goldstein
53
31
0
27 May 2024
TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations
Zhengwu Zhang
Yuntong Hu
Bo Pan
Chen Ling
Liang Zhao
46
2
0
27 May 2024
Compressing Lengthy Context With UltraGist
Peitian Zhang
Zheng Liu
Shitao Xiao
Ninglu Shao
Qiwei Ye
Zhicheng Dou
40
4
0
26 May 2024
Are Long-LLMs A Necessity For Long-Context Tasks?
Hongjin Qian
Zheng Liu
Peitian Zhang
Kelong Mao
Yujia Zhou
Xu Chen
Zhicheng Dou
42
9
0
24 May 2024
Base of RoPE Bounds Context Length
Xin Men
Mingyu Xu
Bingning Wang
Qingyu Zhang
Hongyu Lin
Xianpei Han
Weipeng Chen
42
20
0
23 May 2024
Equipping Transformer with Random-Access Reading for Long-Context Understanding
Chenghao Yang
Zi Yang
Nan Hua
32
1
0
21 May 2024
RoTHP: Rotary Position Embedding-based Transformer Hawkes Process
Anningzhe Gao
Shan Dai
31
3
0
11 May 2024
Linearizing Large Language Models
Jean Mercat
Igor Vasiljevic
Sedrick Scott Keh
Kushal Arora
Achal Dave
Adrien Gaidon
Thomas Kollar
46
19
0
10 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
37
84
0
09 May 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
Yutao Sun
Li Dong
Yi Zhu
Shaohan Huang
Wenhui Wang
Shuming Ma
Quanlu Zhang
Jianyong Wang
Furu Wei
VLM
38
56
0
08 May 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
...
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
MoE
63
399
0
07 May 2024
Long Context Alignment with Short Instructions and Synthesized Positions
Wenhao Wu
Yizhong Wang
Yao Fu
Xiang Yue
Dawei Zhu
Sujian Li
SyDa
54
18
0
07 May 2024
Make Your LLM Fully Utilize the Context
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
SyDa
59
55
0
25 Apr 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
47
22
0
18 Apr 2024
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Hanshi Sun
Zhuoming Chen
Xinyu Yang
Yuandong Tian
Beidi Chen
46
49
0
18 Apr 2024
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
Woomin Song
Seunghyuk Oh
Sangwoo Mo
Jaehyung Kim
Sukmin Yun
Jung-Woo Ha
Jinwoo Shin
40
14
0
16 Apr 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Xuezhe Ma
Xiaomeng Yang
Wenhan Xiong
Beidi Chen
Lili Yu
Hao Zhang
Jonathan May
Luke Zettlemoyer
Omer Levy
Chunting Zhou
53
27
0
12 Apr 2024
LLoCO: Learning Long Contexts Offline
Sijun Tan
Xiuyu Li
Shishir G. Patil
Ziyang Wu
Tianjun Zhang
Kurt Keutzer
Joseph E. Gonzalez
Raluca A. Popa
RALM
OffRL
LLMAG
46
6
0
11 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM
LLMAG
CLL
91
104
0
10 Apr 2024
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
Shengding Hu
Yuge Tu
Xu Han
Chaoqun He
Yuchen Zhang
...
Chaochao Jia
Guoyang Zeng
Dahai Li
Zhiyuan Liu
Maosong Sun
MoE
51
293
0
09 Apr 2024
Long-context LLMs Struggle with Long In-context Learning
Tianle Li
Ge Zhang
Quy Duc Do
Xiang Yue
Wenhu Chen
56
164
0
02 Apr 2024
A Survey on Large Language Model-Based Game Agents
Sihao Hu
Tiansheng Huang
Gaowen Liu
Ramana Rao Kompella
Gaowen Liu
Selim Furkan Tekin
Yichang Xu
Zachary Yahn
Ling Liu
LLMAG
LM&Ro
AI4CE
LM&MA
71
52
0
02 Apr 2024
MEP: Multiple Kernel Learning Enhancing Relative Positional Encoding Length Extrapolation
Weiguo Gao
42
1
0
26 Mar 2024
AIOS: LLM Agent Operating System
Kai Mei
Zelong Li
Wujiang Xu
Wenyue Hua
Mingyu Jin
Yongfeng Zhang
Shuyuan Xu
Ruosong Ye
Yingqiang Ge
Yongfeng Zhang
LLMAG
30
17
0
25 Mar 2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
Mohammad Mahmudul Alam
Edward Raff
Stella Biderman
Tim Oates
James Holt
AAML
38
3
0
23 Mar 2024
StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses
Jia-Nan Li
Quan Tu
Cunli Mao
Zhengtao Yu
Ji-Rong Wen
Rui Yan
OffRL
29
3
0
13 Mar 2024
Breeze-7B Technical Report
Chan-Jan Hsu
Chang-Le Liu
Feng-Ting Liao
Po-Chun Hsu
Yi-Chang Chen
Da-Shan Shiu
34
2
0
05 Mar 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
Zhenyu Zhang
Runjin Chen
Shiwei Liu
Zhewei Yao
Olatunji Ruwase
Beidi Chen
Xiaoxia Wu
Zhangyang Wang
34
26
0
05 Mar 2024
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral
Yiming Cui
Xin Yao
30
4
0
04 Mar 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
61
82
0
26 Feb 2024
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
Lucas Lehnert
Sainbayar Sukhbaatar
DiJia Su
Qinqing Zheng
Paul Mcvay
Michael Rabbat
Yuandong Tian
37
54
0
21 Feb 2024
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Yiran Ding
Li Lyna Zhang
Chengruidong Zhang
Yuanyuan Xu
Ning Shang
Jiahang Xu
Fan Yang
Mao Yang
RALM
48
136
0
21 Feb 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
52
36
0
14 Feb 2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao
Pengle Zhang
Xu Han
Guangxuan Xiao
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
LLMAG
47
35
0
07 Feb 2024
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
Tao Yuan
Xuefei Ning
Dong Zhou
Zhijie Yang
Shiyao Li
...
Dahua Lin
Boxun Li
Guohao Dai
Shengen Yan
Yu Wang
ALM
40
34
0
06 Feb 2024
EscherNet: A Generative Model for Scalable View Synthesis
Xin Kong
Shikun Liu
Xiaoyang Lyu
Marwan Taher
Xiaojuan Qi
Andrew J. Davison
DiffM
88
42
0
06 Feb 2024
UniMem: Towards a Unified View of Long-Context Large Language Models
Junjie Fang
Likai Tang
Hongzhe Bi
Yujia Qin
Si Sun
...
Xiaodong Shi
Sen Song
Yankai Lin
Zhiyuan Liu
Maosong Sun
27
3
0
05 Feb 2024
Zero-Shot Clinical Trial Patient Matching with LLMs
Michael Wornow
Alejandro Lozano
Dev Dash
Jenelle A. Jindal
Kenneth W. Mahaffey
Nigam H. Shah
46
28
0
05 Feb 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
Xindi Wang
Mahsa Salmani
Parsa Omidi
Xiangyu Ren
Mehdi Rezagholizadeh
A. Eshaghi
LRM
39
36
0
03 Feb 2024
Nomic Embed: Training a Reproducible Long Context Text Embedder
Zach Nussbaum
John X. Morris
Brandon Duderstadt
Andriy Mulyar
27
100
0
02 Feb 2024
BlackMamba: Mixture of Experts for State-Space Models
Quentin G. Anthony
Yury Tokpanov
Paolo Glorioso
Beren Millidge
38
21
0
01 Feb 2024
LongAlign: A Recipe for Long Context Alignment of Large Language Models
Yushi Bai
Xin Lv
Jiajie Zhang
Yuze He
Ji Qi
Lei Hou
Jie Tang
Yuxiao Dong
Juanzi Li
ALM
42
46
0
31 Jan 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
69
35
0
16 Jan 2024