arXiv:2008.07027
Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size
Davis Yoshida, Allyson Ettinger, Kevin Gimpel
16 August 2020

Papers citing "Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size" (4 of 4 papers shown):

How to Train Long-Context Language Models (Effectively)
Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen
03 Oct 2024

A Survey of Transformers
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu
08 Jun 2021

What the [MASK]? Making Sense of Language-Specific BERT Models
Debora Nozza, Federico Bianchi, Dirk Hovy
05 Mar 2020

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018