2410.01490
Extending Context Window of Large Language Models from a Distributional Perspective
2 October 2024
Yingsheng Wu
Yuxuan Gu
Xiaocheng Feng
Weihong Zhong
Dongliang Xu
Qing Yang
Hongtao Liu
Bing Qin
Papers citing
"Extending Context Window of Large Language Models from a Distributional Perspective"
3 / 3 papers shown
Title
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
336
775
0
27 Aug 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
288
2,521
0
20 Apr 2021
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
Samyam Rajbhandari
Jeff Rasley
Olatunji Ruwase
Yuxiong He
ALM
AI4CE
82
916
0
04 Oct 2019