arXiv:1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising
K. Xuan
Yongbo Wang
Yongliang Wang
Zujie Wen
Yang Dong
VLM
76
54
0
17 May 2021
Doc2Dict: Information Extraction as Text Generation
Benjamin Townsend
Eamon Ito-Fisher
Lily Zhang
Madison May
56
7
0
16 May 2021
EASE: Extractive-Abstractive Summarization with Explanations
Haoran Li
Arash Einolghozati
Srini Iyer
Bhargavi Paranjape
Yashar Mehdad
Sonal Gupta
Marjan Ghazvininejad
48
10
0
14 May 2021
A cost-benefit analysis of cross-lingual transfer methods
G. Rosa
L. Bonifacio
Leandro Rodrigues de Souza
R. Lotufo
Rodrigo Nogueira
50
12
0
14 May 2021
RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling
Yizhe Zhang
Siqi Sun
Xiang Gao
Yuwei Fang
Chris Brockett
Michel Galley
Jianfeng Gao
Bill Dolan
RALM
106
34
0
14 May 2021
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters
Yan Xu
Etsuko Ishii
Samuel Cahyawijaya
Zihan Liu
Genta Indra Winata
Andrea Madotto
Dan Su
Pascale Fung
RALM
55
45
0
13 May 2021
Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level
Ruiqi Zhong
Dhruba Ghosh
Dan Klein
Jacob Steinhardt
91
36
0
13 May 2021
Improving Code Autocompletion with Transfer Learning
Wenjie Zhou
Seohyun Kim
V. Murali
Gareth Ari Aye
112
15
0
12 May 2021
Go Beyond Plain Fine-tuning: Improving Pretrained Models for Social Commonsense
Ting-Yun Chang
Yang Liu
Karthik Gopalakrishnan
Behnam Hedayatnia
Pei Zhou
Dilek Z. Hakkani-Tür
ReLM
VLM
AI4MH
LRM
18
1
0
12 May 2021
How Reliable are Model Diagnostics?
V. Aribandi
Yi Tay
Donald Metzler
103
19
0
12 May 2021
Incorporating Commonsense Knowledge Graph in Pretrained Models for Social Commonsense Tasks
Ting-Yun Chang
Yang Liu
Karthik Gopalakrishnan
Behnam Hedayatnia
Pei Zhou
Dilek Z. Hakkani-Tür
ReLM
AI4MH
LRM
84
37
0
12 May 2021
Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy
Nicholas Vincent
69
57
0
11 May 2021
EL-Attention: Memory Efficient Lossless Attention for Generation
Yu Yan
Jiusheng Chen
Weizhen Qi
Nikhil Bhendawade
Yeyun Gong
Nan Duan
Ruofei Zhang
VLM
68
6
0
11 May 2021
T-EMDE: Sketching-based global similarity for cross-modal retrieval
Barbara Rychalska
Mikolaj Wieczorek
Jacek Dąbrowski
59
0
0
10 May 2021
Leveraging Slot Descriptions for Zero-Shot Cross-Domain Dialogue State Tracking
Zhaojiang Lin
Bing-Quan Liu
Seungwhan Moon
Paul A. Crook
Zhenpeng Zhou
Zhiguang Wang
Zhou Yu
Andrea Madotto
Eunjoon Cho
R. Subba
73
91
0
10 May 2021
MS MARCO: Benchmarking Ranking Models in the Large-Data Regime
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
Jimmy J. Lin
ALM
83
66
0
09 May 2021
Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation
Zihan Liu
Genta Indra Winata
Pascale Fung
VLM
CLL
96
54
0
09 May 2021
Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies
Yoav Levine
Daniel Jannai
Amnon Shashua
92
20
0
09 May 2021
Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents
Chaojun Xiao
Xueyu Hu
Zhiyuan Liu
Cunchao Tu
Maosong Sun
AILaw
ELM
104
245
0
09 May 2021
FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp
Joshua Ainslie
Ilya Eckstein
Santiago Ontanon
134
536
0
09 May 2021
Long-Span Summarization via Local Attention and Content Selection
Potsawee Manakul
Mark Gales
87
42
0
08 May 2021
Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya Ganesan
Matthew Matero
Aravind Reddy Ravula
Huy-Hien Vu
H. Andrew Schwartz
90
35
0
07 May 2021
Are Pre-trained Convolutions Better than Pre-trained Transformers?
Yi Tay
Mostafa Dehghani
J. Gupta
Dara Bahri
V. Aribandi
Zhen Qin
Donald Metzler
AI4CE
77
49
0
07 May 2021
A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers
Pradeep Dasigi
Kyle Lo
Iz Beltagy
Arman Cohan
Noah A. Smith
Matt Gardner
RALM
134
311
0
07 May 2021
Learning to Perturb Word Embeddings for Out-of-distribution QA
Seanie Lee
Minki Kang
Juho Lee
Sung Ju Hwang
OOD
108
19
0
06 May 2021
A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations
Pierre Colombo
Chloé Clavel
Pablo Piantanida
AAML
DRL
183
51
0
06 May 2021
GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph
Junhan Yang
Zheng Liu
Shitao Xiao
Chaozhuo Li
Defu Lian
Sanjay Agrawal
Amit Singh
Guangzhong Sun
Xing Xie
AI4CE
86
160
0
06 May 2021
Towards General Natural Language Understanding with Probabilistic Worldbuilding
Abulhair Saparov
Tom Michael Mitchell
98
6
0
06 May 2021
Rethinking Search: Making Domain Experts out of Dilettantes
Donald Metzler
Yi Tay
Dara Bahri
Marc Najork
LRM
103
47
0
05 May 2021
HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish
Robert Mroczkowski
Piotr Rybak
Alina Wróblewska
Ireneusz Gawlik
86
85
0
04 May 2021
Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review
Eugene Yang
Sean MacAvaney
D. Lewis
O. Frieder
126
29
0
03 May 2021
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
Nouha Dziri
Hannah Rashkin
Tal Linzen
David Reitter
ALM
317
83
0
30 Apr 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
100
184
0
29 Apr 2021
A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations
Varun Nagaraj Rao
Xingjian Zhen
K. Hovsepian
Mingwei Shen
97
19
0
29 Apr 2021
MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores
Sheng-Chun Kao
T. Krishna
120
52
0
28 Apr 2021
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov
O. Serikov
Ekaterina Artemova
82
9
0
26 Apr 2021
Focused Attention Improves Document-Grounded Generation
Shrimai Prabhumoye
Kazuma Hashimoto
Yingbo Zhou
A. Black
Ruslan Salakhutdinov
225
41
0
26 Apr 2021
What Makes a Message Persuasive? Identifying Adaptations Towards Persuasiveness in Nine Exploratory Case Studies
Sebastian Duerr
Krystian Teodor Lange
P. Gloor
23
2
0
26 Apr 2021
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
80
214
0
26 Apr 2021
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond
Francesco Barbieri
Luis Espinosa Anke
Jose Camacho-Collados
253
227
0
25 Apr 2021
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Cheng Chen
Yichun Yin
Lifeng Shang
Zhi Wang
Xin Jiang
Xiao Chen
Qun Liu
FedML
75
7
0
24 Apr 2021
A Survey of Modern Deep Learning based Object Detection Models
Syed Sahil Abbas Zaidi
M. S. Ansari
Asra Aslam
N. Kanwal
M. Asghar
Brian Lee
VLM
ObjD
155
760
0
24 Apr 2021
Generating abstractive summaries of Lithuanian news articles using a transformer model
Lukas Stankevicius
M. Lukoševičius
50
3
0
23 Apr 2021
Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Guanghui Xu
Shuaicheng Niu
Mingkui Tan
Yucheng Luo
Qing Du
Qi Wu
DiffM
84
58
0
23 Apr 2021
Automated News Summarization Using Transformers
Anushka Gupta
Diksha Chugh
Anjum
R. Katarya
42
65
0
23 Apr 2021
Transfer training from smaller language model
Han Zhang
54
0
0
23 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
368
594
0
22 Apr 2021
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill
Yoav Goldberg
Roy Schwartz
Noah A. Smith
103
69
0
22 Apr 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
345
688
0
21 Apr 2021
Efficient Retrieval Optimized Multi-task Learning
He Fun
S. Gandhi
Sujith Ravi
RALM
74
6
0
20 Apr 2021