ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.00071
  4. Cited By
YaRN: Efficient Context Window Extension of Large Language Models

YaRN: Efficient Context Window Extension of Large Language Models

31 August 2023
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
    OSLM
ArXivPDFHTML

Papers citing "YaRN: Efficient Context Window Extension of Large Language Models"

50 / 178 papers shown
Title
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced
  Extrapolation in LLMs
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs
Xin Ma
Yang Liu
Qingbin Liu
Xiaoxu Ma
31
1
0
21 Oct 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion
  Model
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
ZiDong Wang
Zeyu Lu
Di Huang
Cai Zhou
Wanli Ouyang
and Lei Bai
76
3
0
17 Oct 2024
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
Yizhao Gao
Zhichen Zeng
Dayou Du
Shijie Cao
Hayden Kwok-Hay So
...
Junjie Lai
Mao Yang
Ting Cao
Fan Yang
M. Yang
54
19
0
17 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
95
19
0
14 Oct 2024
TULIP: Token-length Upgraded CLIP
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska
Mohammad Mahdi Derakhshani
Yuki M. Asano
Nanne van Noord
Marcel Worring
Cees G. M. Snoek
VLM
53
3
0
13 Oct 2024
ACER: Automatic Language Model Context Extension via Retrieval
ACER: Automatic Language Model Context Extension via Retrieval
Luyu Gao
Yunyi Zhang
Jamie Callan
RALM
41
0
0
11 Oct 2024
Stuffed Mamba: State Collapse and State Capacity of RNN-Based
  Long-Context Modeling
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling
Yingfa Chen
Xinrong Zhang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
Mamba
59
2
0
09 Oct 2024
FltLM: An Intergrated Long-Context Large Language Model for Effective
  Context Filtering and Understanding
FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding
Jingyang Deng
Zhengyang Shen
Boyang Wang
Lixin Su
Suqi Cheng
Ying Nie
Junfeng Wang
Dawei Yin
Jinwen Ma
39
1
0
09 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
36
3
0
07 Oct 2024
GARLIC: LLM-Guided Dynamic Progress Control with Hierarchical Weighted
  Graph for Long Document QA
GARLIC: LLM-Guided Dynamic Progress Control with Hierarchical Weighted Graph for Long Document QA
Xinyu Wang
Yanzheng Xiang
Lin Gui
Yulan He
31
2
0
07 Oct 2024
Accelerating Inference of Networks in the Frequency Domain
Accelerating Inference of Networks in the Frequency Domain
Chenqiu Zhao
Guanfang Dong
Anup Basu
48
0
0
06 Oct 2024
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling
  for Retrieval-Augmented Generation
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation
Zixuan Li
Jing Xiong
Fanghua Ye
Chuanyang Zheng
Xun Wu
...
Xiaodan Liang
Chengming Li
Zhenan Sun
Lingpeng Kong
Ngai Wong
RALM
UQLM
32
2
0
03 Oct 2024
L-CiteEval: Do Long-Context Models Truly Leverage Context for
  Responding?
L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?
Zecheng Tang
Keyan Zhou
Juntao Li
Baibei Ji
Jianye Hou
Min Zhang
49
2
0
03 Oct 2024
How to Train Long-Context Language Models (Effectively)
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
77
39
0
03 Oct 2024
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
67
27
0
03 Oct 2024
Beyond Prompts: Dynamic Conversational Benchmarking of Large Language
  Models
Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models
David Castillo-Bolado
Joseph Davidson
Finlay Gray
Marek Rosa
34
3
0
30 Sep 2024
Visual Context Window Extension: A New Perspective for Long Video
  Understanding
Visual Context Window Extension: A New Perspective for Long Video Understanding
Hongchen Wei
Zhenzhong Chen
VLM
34
6
0
30 Sep 2024
HelloBench: Evaluating Long Text Generation Capabilities of Large
  Language Models
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models
Haoran Que
Feiyu Duan
Liqun He
Yutao Mou
Wangchunshu Zhou
...
Ge Zhang
Junran Peng
Zhaoxiang Zhang
Songyang Zhang
Kai Chen
LM&MA
ELM
VLM
56
11
0
24 Sep 2024
Towards LifeSpan Cognitive Systems
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
227
1
0
20 Sep 2024
Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free
  Manner
Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner
Yuzhang Shang
Bingxin Xu
Weitai Kang
Mu Cai
Yuheng Li
Zehao Wen
Zhen Dong
Kurt Keutzer
Yong Jae Lee
Yan Yan
41
7
0
19 Sep 2024
Qwen2.5-Coder Technical Report
Qwen2.5-Coder Technical Report
Binyuan Hui
Jian Yang
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
...
Fei Huang
Xingzhang Ren
Xuancheng Ren
Jingren Zhou
Junyang Lin
OSLM
81
227
0
18 Sep 2024
E2LLM: Encoder Elongated Large Language Models for Long-Context
  Understanding and Reasoning
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
Zihan Liao
Jun Wang
Hang Yu
Lingxiao Wei
Jianguo Li
Jun Wang
Wei Zhang
29
3
0
10 Sep 2024
You Only Use Reactive Attention Slice For Long Context Retrieval
You Only Use Reactive Attention Slice For Long Context Retrieval
Yun Joon Soh
Hanxian Huang
Yuandong Tian
Jishen Zhao
RALM
51
0
0
03 Sep 2024
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices
Zhi Chen
Qiguang Chen
Libo Qin
Qipeng Guo
Haijun Lv
Yicheng Zou
Wanxiang Che
Hang Yan
K. Chen
Dahua Lin
SyDa
56
4
0
03 Sep 2024
LongRecipe: Recipe for Efficient Long Context Generalization in Large
  Language Models
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Zhiyuan Hu
Yuliang Liu
Jinman Zhao
Suyuchen Wang
Yan Wang
...
Qing Gu
Anh Tuan Luu
See-Kiong Ng
Zhiwei Jiang
Bryan Hooi
55
11
0
31 Aug 2024
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer
Jinghan Yao
Sam Ade Jacobs
Masahiro Tanaka
Olatunji Ruwase
Hari Subramoni
D. Panda
33
2
0
30 Aug 2024
Writing in the Margins: Better Inference Pattern for Long Context
  Retrieval
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
M. Russak
Umar Jamil
Christopher Bryant
Kiran Kamble
Axel Magnuson
Mateusz Russak
Waseem Alshikh
35
2
0
27 Aug 2024
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Yushi Bai
Jiajie Zhang
Xin Lv
Linzhi Zheng
Siqi Zhu
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
VGen
LLMAG
ALM
47
42
0
13 Aug 2024
LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference
LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference
Zhiwen Mo
Lei Wang
Jianyu Wei
Zhichen Zeng
Shijie Cao
...
Naifeng Jing
Ting Cao
Jilong Xue
Fan Yang
Mao Yang
54
0
0
12 Aug 2024
ReAttention: Training-Free Infinite Context with Finite Attention Scope
ReAttention: Training-Free Infinite Context with Finite Attention Scope
Xiaoran Liu
Ruixiao Li
Yuerong Song
Zhigeng Liu
Kai Lv
Hang Yan
Hang Yan
Linlin Li
Qun Liu
Xipeng Qiu
LLMAG
38
1
0
21 Jul 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng Xu
Ming-Yu Liu
Xianchao Wu
Zihan Liu
M. Shoeybi
Mohammad Shoeybi
Bryan Catanzaro
RALM
52
15
0
19 Jul 2024
Qwen2 Technical Report
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Jian Xu
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM
VLM
MU
60
815
0
15 Jul 2024
Human-like Episodic Memory for Infinite Context LLMs
Human-like Episodic Memory for Infinite Context LLMs
Z. Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
Haitham Bou-Ammar
Jun Wang
31
19
0
12 Jul 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and
  Low-precision
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
Jay Shah
Ganesh Bikshandi
Ying Zhang
Vijay Thakkar
Pradeep Ramani
Tri Dao
76
117
0
11 Jul 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via
  Dynamic Sparse Attention
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang
Yucheng Li
Chengruidong Zhang
Qianhui Wu
Xufang Luo
...
Amir H. Abdi
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
L. Qiu
72
87
0
02 Jul 2024
MMLongBench-Doc: Benchmarking Long-context Document Understanding with
  Visualizations
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
Yubo Ma
Yuhang Zang
Liangyu Chen
Meiqi Chen
Yizhu Jiao
...
Liangming Pan
Yu-Gang Jiang
Jiaqi Wang
Yixin Cao
Aixin Sun
ELM
RALM
VLM
39
25
0
01 Jul 2024
AI-native Memory: A Pathway from LLMs Towards AGI
AI-native Memory: A Pathway from LLMs Towards AGI
Jingbo Shang
Zai Zheng
Jiale Wei
Xiang Ying
Felix Tao
Mindverse Team
LLMAG
58
7
0
26 Jun 2024
UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
Wenhao Li
Mingbao Lin
Mingliang Xu
Shuicheng Yan
Rongrong Ji
43
0
0
26 Jun 2024
Long Context Transfer from Language to Vision
Long Context Transfer from Language to Vision
Peiyuan Zhang
Kaichen Zhang
Bo Li
Guangtao Zeng
Jingkang Yang
Yuanhan Zhang
Ziyue Wang
Haoran Tan
Chunyuan Li
Ziwei Liu
VLM
72
146
0
24 Jun 2024
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
Brandon Huang
Chancharik Mitra
Assaf Arbelle
Leonid Karlinsky
Trevor Darrell
Roei Herzig
49
13
0
21 Jun 2024
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Ziyan Jiang
Xueguang Ma
Wenhu Chen
RALM
60
48
0
21 Jun 2024
MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to
  200K Tokens
MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens
Yongqi Fan
Hongli Sun
Kui Xue
Xiaofan Zhang
Shaoting Zhang
Tong Ruan
47
0
0
21 Jun 2024
ACR: A Benchmark for Automatic Cohort Retrieval
ACR: A Benchmark for Automatic Cohort Retrieval
Dung Ngoc Thai
Victor Ardulov
Jose Ulises Mena
Simran Tiwari
Gleb Erofeev
R. Eskander
Karim Tarabishy
Ravi B Parikh
Wael Salloum
76
0
0
20 Jun 2024
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish
Itamar Zimerman
Shady Abu Hussein
Nadav Cohen
Amir Globerson
Lior Wolf
Raja Giryes
Mamba
77
13
0
20 Jun 2024
VoCo-LLaMA: Towards Vision Compression with Large Language Models
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLM
VLM
43
23
0
18 Jun 2024
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code
  Intelligence
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
DeepSeek-AI
Qihao Zhu
Daya Guo
Zhihong Shao
Dejian Yang
...
Jiashi Li
Chenggang Zhao
Chong Ruan
Fuli Luo
Wenfeng Liang
MoE
LRM
ELM
VLM
48
170
0
17 Jun 2024
BABILong: Testing the Limits of LLMs with Long Context
  Reasoning-in-a-Haystack
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
ALM
LRM
ReLM
ELM
51
61
0
14 Jun 2024
3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position
  Encoding
3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding
Xindian Ma
Wenyuan Liu
Peng Zhang
Nan Xu
44
3
0
14 Jun 2024
From Text to Life: On the Reciprocal Relationship between Artificial
  Life and Large Language Models
From Text to Life: On the Reciprocal Relationship between Artificial Life and Large Language Models
Eleni Nisioti
Claire Glanois
Elias Najarro
Andrew Dai
Elliot Meyerson
J. Pedersen
Laetitia Teodorescu
Conor F. Hayes
Shyam Sudhakaran
Sebastian Risi
AI4CE
LM&Ro
56
3
0
14 Jun 2024
LieRE: Generalizing Rotary Position Encodings
LieRE: Generalizing Rotary Position Encodings
Sophie Ostmeier
Brian Axelrod
Michael E. Moseley
Akshay S. Chaudhari
C. Langlotz
36
0
0
14 Jun 2024
Previous
1234
Next