LSG Attention: Extrapolation of pretrained Transformers to long sequences
Charles Condevaux, S. Harispe
13 October 2022 · arXiv: 2210.15497
Papers citing "LSG Attention: Extrapolation of pretrained Transformers to long sequences" (6 of 6 shown):
1. Wormhole Memory: A Rubik's Cube for Cross-Dialogue Retrieval
   Libo Wang · 24 Jan 2025

2. Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention
   Ziwei He, Jian Yuan, Le Zhou, Jingwen Leng, Bo Jiang · 13 Nov 2023

3. PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
   Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan · 16 Oct 2021

4. LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
   Ilias Chalkidis, Abhik Jana, D. Hartung, M. Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras · 03 Oct 2021

5. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
   Ofir Press, Noah A. Smith, M. Lewis · 27 Aug 2021

6. Big Bird: Transformers for Longer Sequences
   Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed · 28 Jul 2020