1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,856 papers shown
Title
GREEK-BERT: The Greeks visiting Sesame Street
John Koutsikakis
Ilias Chalkidis
Prodromos Malakasiotis
Ion Androutsopoulos
70
92
0
27 Aug 2020
What is being transferred in transfer learning?
Behnam Neyshabur
Hanie Sedghi
Chiyuan Zhang
146
530
0
26 Aug 2020
Analysis and Evaluation of Language Models for Word Sense Disambiguation
Daniel Loureiro
Kiamehr Rezaee
Mohammad Taher Pilehvar
Jose Camacho-Collados
93
14
0
26 Aug 2020
A Baseline Analysis for Podcast Abstractive Summarization
Chujie Zheng
Harry J. Wang
Kunpeng Zhang
Ling Fan
35
12
0
24 Aug 2020
Example-Based Named Entity Recognition
Morteza Ziyadi
Yuting Sun
Abhishek Goswami
Jade Huang
Weizhu Chen
77
33
0
24 Aug 2020
End to End Dialogue Transformer
Ondrej Mekota
Memduh Gökirmak
Petr Laitoch
28
1
0
24 Aug 2020
PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data
Diedre Carmo
Marcos Piau
Israel Campiotti
Rodrigo Nogueira
R. Lotufo
LM&MA
79
52
0
20 Aug 2020
Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes
Nicholas Lourie
Ronan Le Bras
Yejin Choi
72
125
0
20 Aug 2020
Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries
Benjamin Heinzerling
Kentaro Inui
KELM
68
133
0
20 Aug 2020
Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview
P. Bell
Joachim Fainberg
Ondřej Klejch
Jinyu Li
Steve Renals
P. Swietojanski
122
78
0
14 Aug 2020
Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems
Andrea Madotto
Zihan Liu
Zhaojiang Lin
Pascale Fung
111
59
0
14 Aug 2020
The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models
Ian Tenney
James Wexler
Jasmijn Bastings
Tolga Bolukbasi
Andy Coenen
...
Ellen Jiang
Mahima Pushkarna
Carey Radebaugh
Emily Reif
Ann Yuan
VLM
130
196
0
12 Aug 2020
SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media
Amirreza Shirani
Franck Dernoncourt
Nedim Lipka
P. Asente
J. Echevarria
Thamar Solorio
51
21
0
07 Aug 2020
aschern at SemEval-2020 Task 11: It Takes Three to Tango: RoBERTa, CRF, and Transfer Learning
Anton Chernyavskiy
Dmitry Ilvovsky
Preslav Nakov
44
25
0
06 Aug 2020
Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets
Patrick Lewis
Pontus Stenetorp
Sebastian Riedel
OOD
ELM
157
187
0
06 Aug 2020
Forecasting AI Progress: A Research Agenda
Ross Gruetzemacher
Florian E. Dorner
Niko Bernaola-Alvarez
Charlie Giattino
D. Manheim
AI4TS
46
32
0
04 Aug 2020
Taking Notes on the Fly Helps BERT Pre-training
Qiyu Wu
Chen Xing
Yatao Li
Guolin Ke
Di He
Tie-Yan Liu
56
10
0
04 Aug 2020
DeLighT: Deep and Light-weight Transformer
Sachin Mehta
Marjan Ghazvininejad
Srini Iyer
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
83
32
0
03 Aug 2020
Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea
Qiaozhu Mei
158
30
0
31 Jul 2020
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Yu Gu
Robert Tinn
Hao Cheng
Michael R. Lucas
Naoto Usuyama
Xiaodong Liu
Tristan Naumann
Jianfeng Gao
Hoifung Poon
LM&MA
AI4CE
208
1,794
0
31 Jul 2020
Artificial Intelligence in the Battle against Coronavirus (COVID-19): A Survey and Future Research Directions
Thanh Thi Nguyen
Quoc Viet Hung Nguyen
Dung Nguyen
Samuel Yang
Peter W. Eklund
Thien Huynh-The
Thanh Tam Nguyen
Quoc-Viet Pham
Imran Razzak
Edbert B. Hsu
91
191
0
30 Jul 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
611
2,109
0
28 Jul 2020
SummEval: Re-evaluating Summarization Evaluation
Alexander R. Fabbri
Wojciech Kryściński
Bryan McCann
Caiming Xiong
R. Socher
Dragomir R. Radev
HILM
140
724
0
24 Jul 2020
Multi-task learning for natural language processing in the 2020s: where are we going?
Joseph Worsham
Jugal Kalita
AIMat
73
81
0
22 Jul 2020
WordCraft: An Environment for Benchmarking Commonsense Agents
Minqi Jiang
Jelena Luketina
Nantas Nardelli
Pasquale Minervini
Philip Torr
Shimon Whiteson
Tim Rocktäschel
LLMAG
OffRL
49
23
0
17 Jul 2020
Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures
Daniel Furrer
Marc van Zee
Nathan Scales
Nathanael Schärli
CoGe
97
114
0
17 Jul 2020
Investigating Pretrained Language Models for Graph-to-Text Generation
Leonardo F. R. Ribeiro
Martin Schmitt
Hinrich Schütze
Iryna Gurevych
99
218
0
16 Jul 2020
The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction
Alice Martin
Charles Ollion
Florian Strub
Sylvain Le Corff
Olivier Pietquin
55
6
0
15 Jul 2020
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
V. Ramasesh
Ethan Dyer
M. Raghu
CLL
107
179
0
14 Jul 2020
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
Edwin Zhang
Nikhil Gupta
Raphael Tang
Xiao Han
Ronak Pradeep
...
Yue Zhang
Rodrigo Nogueira
Kyunghyun Cho
Hui Fang
Jimmy J. Lin
83
59
0
14 Jul 2020
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
Lifu Tu
Garima Lalwani
Spandana Gella
He He
LRM
119
187
0
14 Jul 2020
ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar
M. Heinzinger
Christian Dallago
Ghalia Rehawi
Yu Wang
...
Tamas B. Fehér
Christoph Angerer
Martin Steinegger
D. Bhowmik
B. Rost
DRL
80
967
0
13 Jul 2020
DART: Open-Domain Structured Data Record to Text Generation
Linyong Nan
Dragomir R. Radev
Rui Zhang
Amrit Rau
Abhinand Sivaprasad
...
Y. Tan
Xi Lin
Caiming Xiong
R. Socher
Nazneen Rajani
60
201
0
06 Jul 2020
Text Data Augmentation: Towards better detection of spear-phishing emails
Mehdi Regina
Maxime Meyer
S. Goutal
55
18
0
04 Jul 2020
Abstractive and mixed summarization for long-single documents
Roger Barrull
Jugal Kalita
37
0
0
03 Jul 2020
Language-agnostic BERT Sentence Embedding
Fangxiaoyu Feng
Yinfei Yang
Daniel Cer
N. Arivazhagan
Wei Wang
185
918
0
03 Jul 2020
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Gautier Izacard
Edouard Grave
RALM
162
1,189
0
02 Jul 2020
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan
Yi Rong
Chen Meng
Zongyan Cao
Siyu Wang
...
Jun Yang
Lixue Xia
Lansong Diao
Xiaoyong Liu
Wei Lin
96
241
0
02 Jul 2020
Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge
Pat Verga
Haitian Sun
Livio Baldini Soares
William W. Cohen
KELM
95
50
0
02 Jul 2020
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks
Denny Zhou
Mao Ye
Chen Chen
Tianjian Meng
Mingxing Tan
Xiaodan Song
Quoc V. Le
Qiang Liu
Dale Schuurmans
63
20
0
01 Jul 2020
A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
H. H. Mao
BDL
SSL
72
50
0
01 Jul 2020
Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering
Ben Bogin
Sanjay Subramanian
Matt Gardner
Jonathan Berant
ReLM
OOD
BDL
LRM
55
28
0
01 Jul 2020
Transferability of Natural Language Inference to Biomedical Question Answering
Minbyul Jeong
Mujeen Sung
Gangwoo Kim
Donghyeon Kim
Wonjin Yoon
J. Yoo
Jaewoo Kang
80
40
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
139
135
0
30 Jun 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
171
1,198
0
30 Jun 2020
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization
Sang Michael Xie
Tengyu Ma
Percy Liang
121
15
0
29 Jun 2020
Answering Questions on COVID-19 in Real-Time
Jinhyuk Lee
Sean S. Yi
Minbyul Jeong
Mujeen Sung
Wonjin Yoon
Yonghwa Choi
Miyoung Ko
Jaewoo Kang
80
43
0
29 Jun 2020
Knowledge-Aware Language Model Pretraining
Corby Rosset
Chenyan Xiong
M. Phan
Xia Song
Paul N. Bennett
Saurabh Tiwary
KELM
94
83
0
29 Jun 2020
Rethinking Positional Encoding in Language Pre-training
Guolin Ke
Di He
Tie-Yan Liu
138
299
0
28 Jun 2020
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Chen Liang
Yue Yu
Haoming Jiang
Siawpeng Er
Ruijia Wang
T. Zhao
Chao Zhang
OffRL
78
240
0
28 Jun 2020