ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.15595
  4. Cited By
Extending Context Window of Large Language Models via Positional
  Interpolation

Extending Context Window of Large Language Models via Positional Interpolation

27 June 2023
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
ArXivPDFHTML

Papers citing "Extending Context Window of Large Language Models via Positional Interpolation"

38 / 388 papers shown
Title
Efficient Streaming Language Models with Attention Sinks
Efficient Streaming Language Models with Attention Sinks
Michel Lang
Yuandong Tian
Beidi Chen
Song Han
Mike Lewis
AI4TS
RALM
44
666
0
29 Sep 2023
Interpretable Long-Form Legal Question Answering with
  Retrieval-Augmented Large Language Models
Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models
Antoine Louis
Gijs van Dijck
Gerasimos Spanakis
ELM
AILaw
30
35
0
29 Sep 2023
Qwen Technical Report
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
108
1,622
0
28 Sep 2023
Attention Sorting Combats Recency Bias In Long Context Language Models
Attention Sorting Combats Recency Bias In Long Context Language Models
A. Peysakhovich
Adam Lerer
LRM
RALM
49
43
0
28 Sep 2023
Effective Long-Context Scaling of Foundation Models
Effective Long-Context Scaling of Foundation Models
Wenhan Xiong
Jingyu Liu
Igor Molybog
Hejia Zhang
Prajjwal Bhargava
...
Dániel Baráth
Sergey Edunov
Mike Lewis
Sinong Wang
Hao Ma
42
208
0
27 Sep 2023
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling
  Capacities of Large Language Models
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models
Zican Dong
Tianyi Tang
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
RALM
ALM
36
34
0
23 Sep 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
61
153
0
21 Sep 2023
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
Nolan Dey
Daria Soboleva
Faisal Al-Khateeb
Bowen Yang
Ribhu Pathria
...
Robert Myers
Jacob Robert Steeves
Natalia Vassilieva
Marvin Tom
Joel Hestness
MoE
41
15
0
20 Sep 2023
PoSE: Efficient Context Window Extension of LLMs via Positional
  Skip-wise Training
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
Dawei Zhu
Nan Yang
Liang Wang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
76
78
0
19 Sep 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in
  Transformers for Long Context Window Extending
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
32
4
0
15 Sep 2023
InvestLM: A Large Language Model for Investment using Financial Domain
  Instruction Tuning
InvestLM: A Large Language Model for Investment using Financial Domain Instruction Tuning
Yi Yang
Yixuan Tang
Kar Yan Tam
ALM
AIFin
35
63
0
15 Sep 2023
Large Language Models for Compiler Optimization
Large Language Models for Compiler Optimization
Chris Cummins
Volker Seeker
Dejan Grubisic
Mostafa Elhoushi
Youwei Liang
...
Jonas Gehring
Fabian Gloeckle
Kim M. Hazelwood
Gabriel Synnaeve
Hugh Leather
26
48
0
11 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
FLM-101B: An Open LLM and How to Train It with 100KBudget100K Budget100KBudget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
22
0
07 Sep 2023
FArMARe: a Furniture-Aware Multi-task methodology for Recommending
  Apartments based on the user interests
FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
Ali Abdari
Alex Falcon
Giuseppe Serra
34
2
0
06 Sep 2023
YaRN: Efficient Context Window Extension of Large Language Models
YaRN: Efficient Context Window Extension of Large Language Models
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
OSLM
32
226
0
31 Aug 2023
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language
  Models
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
Chi Han
Qifan Wang
Hao Peng
Wenhan Xiong
Yu Chen
Heng Ji
Sinong Wang
50
50
0
30 Aug 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Qingyue Wang
Y. Fu
Yanan Cao
Zhiliang Tian
Shi Wang
Dacheng Tao
LLMAG
KELM
RALM
70
25
0
29 Aug 2023
LongBench: A Bilingual, Multitask Benchmark for Long Context
  Understanding
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Yushi Bai
Xin Lv
Jiajie Zhang
Hong Lyu
Jiankai Tang
...
Aohan Zeng
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
LLMAG
RALM
31
510
0
28 Aug 2023
MedAlign: A Clinician-Generated Dataset for Instruction Following with
  Electronic Medical Records
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Scott L. Fleming
Alejandro Lozano
W. Haberkorn
Jenelle A. Jindal
E. Reis
...
Jonathan Chen
Keith Morse
Emma Brunskill
Jason Alan Fries
N. Shah
LM&MA
30
54
0
27 Aug 2023
Code Llama: Open Foundation Models for Code
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
63
1,924
0
24 Aug 2023
Giraffe: Adventures in Expanding Context Lengths in LLMs
Giraffe: Adventures in Expanding Context Lengths in LLMs
Arka Pal
Deep Karkhanis
Manley Roberts
Samuel Dooley
Arvind Sundararajan
Siddartha Naidu
37
40
0
21 Aug 2023
LMTuner: An user-friendly and highly-integrable Training Framework for
  fine-tuning Large Language Models
LMTuner: An user-friendly and highly-integrable Training Framework for fine-tuning Large Language Models
Yixuan Weng
Zhiqi Wang
Huanxuan Liao
Shizhu He
Shengping Liu
Kang Liu
Jun Zhao
45
3
0
20 Aug 2023
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain
  Conversation
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation
Junru Lu
Siyu An
Mingbao Lin
Gabriele Pergola
Yulan He
Di Yin
Xing Sun
Yunsheng Wu
49
32
0
16 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
71
120
0
14 Aug 2023
Local Large Language Models for Complex Structured Medical Tasks
Local Large Language Models for Complex Structured Medical Tasks
V. Bumgardner
Aaron D. Mullen
Samuel E. Armstrong
Caylin D. Hickey
Jeffrey A. Talbert
36
5
0
03 Aug 2023
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world
  APIs
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Yujia Qin
Shi Liang
Yining Ye
Kunlun Zhu
Lan Yan
...
Jie Zhou
Mark B. Gerstein
Dahai Li
Zhiyuan Liu
Maosong Sun
CLL
ALM
LLMAG
ELM
LM&MA
87
629
0
31 Jul 2023
RLCD: Reinforcement Learning from Contrastive Distillation for Language
  Model Alignment
RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment
Kevin Kaichuang Yang
Dan Klein
Asli Celikyilmaz
Nanyun Peng
Yuandong Tian
ALM
41
30
0
24 Jul 2023
L-Eval: Instituting Standardized Evaluation for Long Context Language
  Models
L-Eval: Instituting Standardized Evaluation for Long Context Language Models
Chen An
Shansan Gong
Ming Zhong
Xingjian Zhao
Mukai Li
Jun Zhang
Lingpeng Kong
Xipeng Qiu
ELM
ALM
50
136
0
20 Jul 2023
A Comprehensive Overview of Large Language Models
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
70
544
0
12 Jul 2023
PolyLM: An Open Source Polyglot Large Language Model
PolyLM: An Open Source Polyglot Large Language Model
Xiangpeng Wei
Hao-Ran Wei
Huan Lin
Tianhao Li
Pei Zhang
...
Yu Bowen
Dayiheng Liu
Baosong Yang
Fei Huang
Jun Xie
LRM
48
55
0
12 Jul 2023
Focused Transformer: Contrastive Training for Context Scaling
Focused Transformer: Contrastive Training for Context Scaling
Szymon Tworkowski
Konrad Staniszewski
Mikolaj Pacek
Yuhuai Wu
Henryk Michalewski
Piotr Milo's
39
136
0
06 Jul 2023
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large
  Foundation Models
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Shizhe Diao
Rui Pan
Hanze Dong
Kashun Shum
Jipeng Zhang
Wei Xiong
Tong Zhang
ALM
22
63
0
21 Jun 2023
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Yiming Cui
Ziqing Yang
Xin Yao
ALM
34
298
0
17 Apr 2023
Pretrained Language Models for Text Generation: A Survey
Pretrained Language Models for Text Generation: A Survey
Junyi Li
Tianyi Tang
Wayne Xin Zhao
J. Nie
Ji-Rong Wen
AI4CE
36
128
0
14 Jan 2022
Train Short, Test Long: Attention with Linear Biases Enables Input
  Length Extrapolation
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
253
710
0
27 Aug 2021
Combiner: Full Attention Transformer with Sparse Computation Cost
Combiner: Full Attention Transformer with Sparse Computation Cost
Hongyu Ren
H. Dai
Zihang Dai
Mengjiao Yang
J. Leskovec
Dale Schuurmans
Bo Dai
87
77
0
12 Jul 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,007
0
31 Dec 2020
Big Bird: Transformers for Longer Sequences
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,028
0
28 Jul 2020
Previous
12345678