1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
Yuhuai Wu
M. Rabe
Wenda Li
Jimmy Ba
Roger C. Grosse
Christian Szegedy
AIMat
LRM
142
57
0
15 Jan 2021
Structured Prediction as Translation between Augmented Natural Languages
Giovanni Paolini
Ben Athiwaratkun
Jason Krone
Jie Ma
Alessandro Achille
Rishita Anubhai
Cicero Nogueira dos Santos
Bing Xiang
Stefano Soatto
90
295
0
14 Jan 2021
The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
Ronak Pradeep
Rodrigo Nogueira
Jimmy J. Lin
MoE
132
172
0
14 Jan 2021
Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation
Ieva Staliunaite
P. Gorinski
Ignacio Iacobacci
LRM
277
22
0
13 Jan 2021
Robustness Gym: Unifying the NLP Evaluation Landscape
Karan Goel
Nazneen Rajani
Jesse Vig
Samson Tan
Jason M. Wu
Stephan Zheng
Caiming Xiong
Joey Tianyi Zhou
Christopher Ré
AAML
OffRL
OOD
199
140
0
13 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
103
17
0
12 Jan 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
124
2,247
0
11 Jan 2021
BERT & Family Eat Word Salad: Experiments with Text Understanding
Ashim Gupta
Giorgi Kvernadze
Vivek Srikumar
258
73
0
10 Jan 2021
Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
Wen-Yi Hsiao
Jen-Yu Liu
Yin-Cheng Yeh
Yi-Hsuan Yang
193
187
0
07 Jan 2021
Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English
Xiangyang Li
Yu Xia
Xiang Long
Zheng Li
Sujian Li
258
37
0
07 Jan 2021
TextBox: A Unified, Modularized, and Extensible Framework for Text Generation
Junyi Li
Tianyi Tang
Gaole He
Jinhao Jiang
Xiaoxuan Hu
Puzhao Xie
Zhipeng Chen
Zhuohao Yu
Wayne Xin Zhao
Ji-Rong Wen
131
25
0
06 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
179
354
0
05 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
385
2,560
0
04 Jan 2021
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering
Fengbin Zhu
Wenqiang Lei
Chao Wang
Jianming Zheng
Soujanya Poria
Tat-Seng Chua
RALM
277
257
0
04 Jan 2021
Learning to Generate Task-Specific Adapters from Task Description
Qinyuan Ye
Xiang Ren
190
32
0
02 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
83
16
0
02 Jan 2021
CDLM: Cross-Document Language Modeling
Avi Caciularu
Arman Cohan
Iz Beltagy
Matthew E. Peters
Arie Cattan
Ido Dagan
VLM
75
33
0
02 Jan 2021
Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words
Valentin Hofmann
J. Pierrehumbert
Hinrich Schütze
120
72
0
02 Jan 2021
Learning to Emphasize: Dataset and Shared Task Models for Selecting Emphasis in Presentation Slides
Amirreza Shirani
Gia-Lac Tran
Hieu Trinh
Franck Dernoncourt
Nedim Lipka
P. Asente
J. Echevarria
Thamar Solorio
303
1
0
02 Jan 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Jeff Da
Ronan Le Bras
Ximing Lu
Yejin Choi
Antoine Bosselut
AI4MH
KELM
178
41
0
01 Jan 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu
Marco Tulio Ribeiro
Jeffrey Heer
Daniel S. Weld
139
251
0
01 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
254
4,330
0
01 Jan 2021
UnitedQA: A Hybrid Approach for Open Domain Question Answering
Hao Cheng
Yelong Shen
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
110
56
0
01 Jan 2021
NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned
Sewon Min
Jordan L. Boyd-Graber
Chris Alberti
Danqi Chen
Eunsol Choi
...
Dmytro Okhonko
Michael Schlichtkrull
Sonal Gupta
Yashar Mehdad
Wen-tau Yih
81
62
0
01 Jan 2021
Multi-task Retrieval for Knowledge-Intensive Tasks
Jean Maillard
Vladimir Karpukhin
Fabio Petroni
Wen-tau Yih
Barlas Oğuz
Veselin Stoyanov
Gargi Ghosh
261
67
0
01 Jan 2021
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen
Yu Cheng
Shuohang Wang
Zhe Gan
Zhangyang Wang
Jingjing Liu
131
100
0
31 Dec 2020
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Wen-tau Yih
Xiang Ren
Madian Khabsa
OffRL
79
12
0
31 Dec 2020
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
304
91
0
31 Dec 2020
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang
Hangbo Bao
Shaohan Huang
Li Dong
Furu Wei
MQ
127
274
0
31 Dec 2020
Evidence-based Factual Error Correction
James Thorne
Andreas Vlachos
KELM
OffRL
96
58
0
31 Dec 2020
Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences
Denis Emelin
Ronan Le Bras
Jena D. Hwang
Maxwell Forbes
Yejin Choi
LRM
118
135
0
31 Dec 2020
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
435
1,984
0
31 Dec 2020
ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora
Ouyang Xuan
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
144
102
0
31 Dec 2020
CoCoLM: COmplex COmmonsense Enhanced Language Model with Discourse Relations
Changlong Yu
Hongming Zhang
Yangqiu Song
Wilfred Ng
117
21
0
31 Dec 2020
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
166
256
0
31 Dec 2020
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Shuming Ma
Jian Yang
Haoyang Huang
Zewen Chi
Li Dong
...
Akiko Eriguchi
Saksham Singhal
Xia Song
Arul Menezes
Furu Wei
LRM
85
33
0
31 Dec 2020
BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
Weizhen Qi
Yeyun Gong
Jian Jiao
Yu Yan
Weizhu Chen
...
Houqiang Li
Jiusheng Chen
Ruofei Zhang
Ming Zhou
Nan Duan
101
46
0
31 Dec 2020
FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation
Kushal Lakhotia
Bhargavi Paranjape
Asish Ghoshal
Wen-tau Yih
Yashar Mehdad
Srini Iyer
63
28
0
31 Dec 2020
Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation
Damian Pascual
Béni Egressy
Florian Bolli
Roger Wattenhofer
81
20
0
31 Dec 2020
Corrected CBOW Performs as well as Skip-gram
Ozan Irsoy
Adrian Benton
K. Stratos
SyDa
34
12
0
30 Dec 2020
Out of Order: How Important Is The Sequential Order of Words in a Sentence in Natural Language Understanding Tasks?
Thang M. Pham
Trung Bui
Long Mai
Anh Totti Nguyen
289
123
0
30 Dec 2020
A Memory Efficient Baseline for Open Domain Question Answering
Gautier Izacard
Fabio Petroni
Lucas Hosseini
Nicola De Cao
Sebastian Riedel
Edouard Grave
MQ
66
44
0
30 Dec 2020
Improving BERT with Syntax-aware Local Attention
Zhongli Li
Qingyu Zhou
Chao Li
Ke Xu
Yunbo Cao
102
44
0
30 Dec 2020
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
321
171
0
30 Dec 2020
Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision
Si Sun
Yingzhuo Qian
Zhenghao Liu
Chenyan Xiong
Kaitao Zhang
Jie Bao
Zhiyuan Liu
Paul N. Bennett
90
18
0
29 Dec 2020
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
237
522
0
29 Dec 2020
UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering
Barlas Oğuz
Xilun Chen
Vladimir Karpukhin
Stanislav Peshterliev
Dmytro Okhonko
Michael Schlichtkrull
Sonal Gupta
Yashar Mehdad
Scott Yih
281
89
0
29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked Language Model
Ziyi Yang
Yinfei Yang
Daniel Cer
Jax Law
Eric F. Darve
SSL
84
58
0
28 Dec 2020
Explaining NLP Models via Minimal Contrastive Editing (MiCE)
Alexis Ross
Ana Marasović
Matthew E. Peters
75
122
0
27 Dec 2020
ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic
Muhammad Abdul-Mageed
AbdelRahim Elmadany
El Moatez Billah Nagoudi
VLM
129
465
0
27 Dec 2020