1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
113
37
0
20 Apr 2022
A Corpus for Understanding and Generating Moral Stories
Jian Guan
Ziqi Liu
Minlie Huang
75
10
0
20 Apr 2022
Probing for the Usage of Grammatical Number
Karim Lasri
Tiago Pimentel
Alessandro Lenci
Thierry Poibeau
Ryan Cotterell
80
58
0
19 Apr 2022
On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules
Divyam Goel
Raman Grover
Fatemeh H. Fard
76
19
0
19 Apr 2022
StableMoE: Stable Routing Strategy for Mixture of Experts
Damai Dai
Li Dong
Shuming Ma
Bo Zheng
Zhifang Sui
Baobao Chang
Furu Wei
MoE
73
66
0
18 Apr 2022
UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation
Amirhossein Abaskohi
A. Rasouli
Tanin Zeraati
B. Bahrak
63
11
0
18 Apr 2022
A Study on Prompt-based Few-Shot Learning Methods for Belief State Tracking in Task-oriented Dialog Systems
Debjoy Saha
Bishal Santra
Pawan Goyal
33
2
0
18 Apr 2022
Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling
Yiyang Li
Hai Zhao
Zhuosheng Zhang
57
11
0
18 Apr 2022
Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models
Terra Blevins
Luke Zettlemoyer
151
92
0
17 Apr 2022
ArcaneQA: Dynamic Program Induction and Contextualized Encoding for Knowledge Base Question Answering
Yu Gu
Yu-Chuan Su
93
73
0
17 Apr 2022
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin
Kangmin Tan
Chris Miller
Beiwen Tian
Xiang Ren
LRM
RALM
84
49
0
17 Apr 2022
On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?
Nouha Dziri
Sivan Milton
Mo Yu
Osmar Zaiane
Siva Reddy
HILM
74
195
0
17 Apr 2022
Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding
Changtong Zan
Liang Ding
Li Shen
Yu Cao
Weifeng Liu
Dacheng Tao
LRM
103
8
0
16 Apr 2022
A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis
Bing Wang
Liang Ding
Qihuang Zhong
Ximing Li
Dacheng Tao
83
33
0
16 Apr 2022
A Hierarchical N-Gram Framework for Zero-Shot Link Prediction
Mingchen Li
Jiasi Chen
Samuel Mensah
Nikolaos Aletras
Xiulong Yang
Yang Ye
113
14
0
16 Apr 2022
WordAlchemy: A transformer-based Reverse Dictionary
S. Mane
Harshali B. Patil
Kanhaiya Madaswar
Pranav Sadavarte
105
5
0
16 Apr 2022
Probing Script Knowledge from Pre-Trained Models
Zijian Jin
Xingyu Zhang
Mo Yu
Lifu Huang
60
5
0
16 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
133
864
0
16 Apr 2022
Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes
Kaige Xie
Sarah Wiegreffe
Mark O. Riedl
ReLM
86
12
0
16 Apr 2022
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
Simiao Zuo
Qingru Zhang
Chen Liang
Pengcheng He
T. Zhao
Weizhu Chen
MoE
195
41
0
15 Apr 2022
CILDA: Contrastive Data Augmentation using Intermediate Layer Knowledge Distillation
Md. Akmal Haidar
Mehdi Rezagholizadeh
Abbas Ghaddar
Khalil Bibi
Philippe Langlais
Pascal Poupart
CLL
77
7
0
15 Apr 2022
Evaluating Factuality in Text Simplification
Ashwin Devaraj
William Sheffield
Byron C. Wallace
Junyi Jessy Li
HILM
83
43
0
15 Apr 2022
Improving Passage Retrieval with Zero-Shot Question Generation
Devendra Singh Sachan
M. Lewis
Mandar Joshi
Armen Aghajanyan
Wen-tau Yih
J. Pineau
Luke Zettlemoyer
OOD
LRM
130
167
0
15 Apr 2022
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
128
155
0
15 Apr 2022
Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters
Tal Schuster
Sihao Chen
S. Buthpitiya
Alex Fabrikant
Donald Metzler
101
41
0
15 Apr 2022
ML_LTU at SemEval-2022 Task 4: T5 Towards Identifying Patronizing and Condescending Language
Tosin Adewumi
Lama Alkhaled
Hamam Alkhaled
F. Liwicki
Marcus Liwicki
36
5
0
15 Apr 2022
Towards Fine-grained Causal Reasoning and QA
Linyi Yang
Zhen Wang
Yuxiang Wu
Jie Yang
Yue Zhang
88
16
0
15 Apr 2022
On the Role of Pre-trained Language Models in Word Ordering: A Case Study with BART
Zebin Ou
Meishan Zhang
Yue Zhang
50
3
0
15 Apr 2022
Identifying and Measuring Token-Level Sentiment Bias in Pre-trained Language Models with Prompts
Apoorv Garg
Deval Srivastava
Zhiyang Xu
Lifu Huang
21
5
0
15 Apr 2022
CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations
Rakesh R Menon
Sayan Ghosh
Shashank Srivastava
LRM
ELM
103
11
0
14 Apr 2022
Exploring Dual Encoder Architectures for Question Answering
Zhe Dong
Jianmo Ni
Daniel M. Bikel
Enrique Alfonseca
Yuanjin Wang
Chen Qu
I. Zitouni
65
18
0
14 Apr 2022
Composite Code Sparse Autoencoders for first stage retrieval
Carlos Lassance
Thibault Formal
Stéphane Clinchant
65
4
0
14 Apr 2022
A Unified Multi-task Learning Framework for Multi-goal Conversational Recommender Systems
Yang Deng
Wenxuan Zhang
Weiwen Xu
Wenqiang Lei
Tat-Seng Chua
W. Lam
LRM
104
66
0
14 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
189
841
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
118
32
0
13 Apr 2022
EHRKit: A Python Natural Language Processing Toolkit for Electronic Health Record Texts
Irene Li
Keen You
Yujie Qiao
Lucas Huang
Chia-Chun Hsieh
Benjamin Rosand
Xiangru Tang
Dragomir R. Radev
89
4
0
13 Apr 2022
Scalable Training of Language Models using JAX pjit and TPUv4
Joanna Yoo
Kuba Perlin
Siddhartha Rao Kamalakara
J. Araújo
VLM
71
10
0
13 Apr 2022
ASQA: Factoid Questions Meet Long-Form Answers
Ivan Stelmakh
Yi Luan
Bhuwan Dhingra
Ming-Wei Chang
77
178
0
12 Apr 2022
A Review on Language Models as Knowledge Bases
Badr AlKhamissi
Millicent Li
Asli Celikyilmaz
Mona T. Diab
Marjan Ghazvininejad
KELM
103
187
0
12 Apr 2022
InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried
Armen Aghajanyan
Jessy Lin
Sida I. Wang
Eric Wallace
Freda Shi
Ruiqi Zhong
Wen-tau Yih
Luke Zettlemoyer
M. Lewis
SyDa
125
659
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
140
176
0
12 Apr 2022
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Swaroop Mishra
Arindam Mitra
Neeraj Varshney
Bhavdeep Singh Sachdeva
Peter Clark
Chitta Baral
Ashwin Kalyan
AIMat
ReLM
ELM
LRM
98
110
0
12 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
97
9
0
11 Apr 2022
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0
F. Toni
Christopher Akiki
Javier de la Rosa
Clémentine Fourrier
Enrique Manjavacas
Stefan Schweter
Daniel Alexander van Strien
73
11
0
11 Apr 2022
NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias
Nayeon Lee
Yejin Bang
Tiezheng Yu
Andrea Madotto
Pascale Fung
58
29
0
11 Apr 2022
Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning
Swarnadeep Saha
Prateek Yadav
Joey Tianyi Zhou
63
9
0
11 Apr 2022
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Zifeng Wang
Zizhao Zhang
Sayna Ebrahimi
Ruoxi Sun
Han Zhang
...
Xiaoqi Ren
Guolong Su
Vincent Perot
Jennifer Dy
Tomas Pfister
CLL
VLM
VPVLM
132
504
0
10 Apr 2022
UniDU: Towards A Unified Generative Dialogue Understanding Framework
Zhi Chen
Lu Chen
B. Chen
Libo Qin
Yuncong Liu
Su Zhu
Jian-Guang Lou
Kai Yu
98
13
0
10 Apr 2022
Decay No More: A Persistent Twitter Dataset for Learning Social Meaning
Chiyu Zhang
Muhammad Abdul-Mageed
El Moatez Billah Nagoudi
105
3
0
10 Apr 2022
Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks
Haoran Yang
Piji Li
Wai Lam
77
4
0
10 Apr 2022