arXiv: 2502.15485
Enhancing RWKV-based Language Models for Long-Sequence Text Generation
21 February 2025
Xinghan Pan
Papers citing
"Enhancing RWKV-based Language Models for Long-Sequence Text Generation"
9 / 9 papers shown
Title
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
165
585
0
22 May 2023
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
484
2,051
0
28 Jul 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
François Fleuret
121
1,734
0
29 Jun 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
100
3,996
0
10 Apr 2020
Adaptive Attention Span in Transformers
Sainbayar Sukhbaatar
Edouard Grave
Piotr Bojanowski
Armand Joulin
63
285
0
19 May 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
201
5,668
0
21 Apr 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
140
3,714
0
09 Jan 2019
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
453
129,831
0
12 Jun 2017
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
254
10,412
0
21 Jul 2016