
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat
arXiv:1910.10683

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
113
37
0
20 Apr 2022
A Corpus for Understanding and Generating Moral Stories
Jian Guan
Ziqi Liu
Minlie Huang
75
10
0
20 Apr 2022
Probing for the Usage of Grammatical Number
Karim Lasri
Tiago Pimentel
Alessandro Lenci
Thierry Poibeau
Ryan Cotterell
80
58
0
19 Apr 2022
On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules
Divyam Goel
Raman Grover
Fatemeh H. Fard
76
19
0
19 Apr 2022
StableMoE: Stable Routing Strategy for Mixture of Experts
Damai Dai
Li Dong
Shuming Ma
Bo Zheng
Zhifang Sui
Baobao Chang
Furu Wei
MoE
73
66
0
18 Apr 2022
UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation
Amirhossein Abaskohi
A. Rasouli
Tanin Zeraati
B. Bahrak
63
11
0
18 Apr 2022
A Study on Prompt-based Few-Shot Learning Methods for Belief State Tracking in Task-oriented Dialog Systems
Debjoy Saha
Bishal Santra
Pawan Goyal
33
2
0
18 Apr 2022
Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling
Yiyang Li
Hai Zhao
Zhuosheng Zhang
57
11
0
18 Apr 2022
Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models
Terra Blevins
Luke Zettlemoyer
151
92
0
17 Apr 2022
ArcaneQA: Dynamic Program Induction and Contextualized Encoding for Knowledge Base Question Answering
Yu Gu
Yu-Chuan Su
93
73
0
17 Apr 2022
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin
Kangmin Tan
Chris Miller
Beiwen Tian
Xiang Ren
LRM, RALM
84
49
0
17 Apr 2022
On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?
Nouha Dziri
Sivan Milton
Mo Yu
Osmar Zaiane
Siva Reddy
HILM
74
195
0
17 Apr 2022
Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding
Changtong Zan
Liang Ding
Li Shen
Yu Cao
Weifeng Liu
Dacheng Tao
LRM
103
8
0
16 Apr 2022
A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis
Bing Wang
Liang Ding
Qihuang Zhong
Ximing Li
Dacheng Tao
83
33
0
16 Apr 2022
A Hierarchical N-Gram Framework for Zero-Shot Link Prediction
Mingchen Li
Jiasi Chen
Samuel Mensah
Nikolaos Aletras
Xiulong Yang
Yang Ye
113
14
0
16 Apr 2022
WordAlchemy: A transformer-based Reverse Dictionary
S. Mane
Harshali B. Patil
Kanhaiya Madaswar
Pranav Sadavarte
105
5
0
16 Apr 2022
Probing Script Knowledge from Pre-Trained Models
Zijian Jin
Xingyu Zhang
Mo Yu
Lifu Huang
60
5
0
16 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
133
864
0
16 Apr 2022
Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes
Kaige Xie
Sarah Wiegreffe
Mark O. Riedl
ReLM
86
12
0
16 Apr 2022
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
Simiao Zuo
Qingru Zhang
Chen Liang
Pengcheng He
T. Zhao
Weizhu Chen
MoE
195
41
0
15 Apr 2022
CILDA: Contrastive Data Augmentation using Intermediate Layer Knowledge Distillation
Md. Akmal Haidar
Mehdi Rezagholizadeh
Abbas Ghaddar
Khalil Bibi
Philippe Langlais
Pascal Poupart
CLL
77
7
0
15 Apr 2022
Evaluating Factuality in Text Simplification
Ashwin Devaraj
William Sheffield
Byron C. Wallace
Junyi Jessy Li
HILM
83
43
0
15 Apr 2022
Improving Passage Retrieval with Zero-Shot Question Generation
Devendra Singh Sachan
M. Lewis
Mandar Joshi
Armen Aghajanyan
Wen-tau Yih
J. Pineau
Luke Zettlemoyer
OOD, LRM
130
167
0
15 Apr 2022
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
128
155
0
15 Apr 2022
Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters
Tal Schuster
Sihao Chen
S. Buthpitiya
Alex Fabrikant
Donald Metzler
101
41
0
15 Apr 2022
ML_LTU at SemEval-2022 Task 4: T5 Towards Identifying Patronizing and Condescending Language
Tosin Adewumi
Lama Alkhaled
Hamam Alkhaled
F. Liwicki
Marcus Liwicki
36
5
0
15 Apr 2022
Towards Fine-grained Causal Reasoning and QA
Linyi Yang
Zhen Wang
Yuxiang Wu
Jie Yang
Yue Zhang
88
16
0
15 Apr 2022
On the Role of Pre-trained Language Models in Word Ordering: A Case Study with BART
Zebin Ou
Meishan Zhang
Yue Zhang
50
3
0
15 Apr 2022
Identifying and Measuring Token-Level Sentiment Bias in Pre-trained Language Models with Prompts
Apoorv Garg
Deval Srivastava
Zhiyang Xu
Lifu Huang
21
5
0
15 Apr 2022
CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations
Rakesh R Menon
Sayan Ghosh
Shashank Srivastava
LRM, ELM
103
11
0
14 Apr 2022
Exploring Dual Encoder Architectures for Question Answering
Zhe Dong
Jianmo Ni
Daniel M. Bikel
Enrique Alfonseca
Yuanjin Wang
Chen Qu
I. Zitouni
65
18
0
14 Apr 2022
Composite Code Sparse Autoencoders for first stage retrieval
Carlos Lassance
Thibault Formal
Stéphane Clinchant
65
4
0
14 Apr 2022
A Unified Multi-task Learning Framework for Multi-goal Conversational Recommender Systems
Yang Deng
Wenxuan Zhang
Weiwen Xu
Wenqiang Lei
Tat-Seng Chua
W. Lam
LRM
104
66
0
14 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
189
841
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
118
32
0
13 Apr 2022
EHRKit: A Python Natural Language Processing Toolkit for Electronic Health Record Texts
Irene Li
Keen You
Yujie Qiao
Lucas Huang
Chia-Chun Hsieh
Benjamin Rosand
Xiangru Tang
Dragomir R. Radev
89
4
0
13 Apr 2022
Scalable Training of Language Models using JAX pjit and TPUv4
Joanna Yoo
Kuba Perlin
Siddhartha Rao Kamalakara
J. Araújo
VLM
71
10
0
13 Apr 2022
ASQA: Factoid Questions Meet Long-Form Answers
Ivan Stelmakh
Yi Luan
Bhuwan Dhingra
Ming-Wei Chang
77
178
0
12 Apr 2022
A Review on Language Models as Knowledge Bases
Badr AlKhamissi
Millicent Li
Asli Celikyilmaz
Mona T. Diab
Marjan Ghazvininejad
KELM
103
187
0
12 Apr 2022
InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried
Armen Aghajanyan
Jessy Lin
Sida I. Wang
Eric Wallace
Freda Shi
Ruiqi Zhong
Wen-tau Yih
Luke Zettlemoyer
M. Lewis
SyDa
125
659
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
140
176
0
12 Apr 2022
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Swaroop Mishra
Arindam Mitra
Neeraj Varshney
Bhavdeep Singh Sachdeva
Peter Clark
Chitta Baral
Ashwin Kalyan
AIMat, ReLM, ELM, LRM
98
110
0
12 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
97
9
0
11 Apr 2022
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0
F. Toni
Christopher Akiki
Javier de la Rosa
Clémentine Fourrier
Enrique Manjavacas
Stefan Schweter
Daniel Alexander van Strien
73
11
0
11 Apr 2022
NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias
Nayeon Lee
Yejin Bang
Tiezheng Yu
Andrea Madotto
Pascale Fung
58
29
0
11 Apr 2022
Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning
Swarnadeep Saha
Prateek Yadav
Joey Tianyi Zhou
63
9
0
11 Apr 2022
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Zifeng Wang
Zizhao Zhang
Sayna Ebrahimi
Ruoxi Sun
Han Zhang
...
Xiaoqi Ren
Guolong Su
Vincent Perot
Jennifer Dy
Tomas Pfister
CLL, VLM, VPVLM
132
504
0
10 Apr 2022
UniDU: Towards A Unified Generative Dialogue Understanding Framework
Zhi Chen
Lu Chen
B. Chen
Libo Qin
Yuncong Liu
Su Zhu
Jian-Guang Lou
Kai Yu
98
13
0
10 Apr 2022
Decay No More: A Persistent Twitter Dataset for Learning Social Meaning
Chiyu Zhang
Muhammad Abdul-Mageed
El Moatez Billah Nagoudi
105
3
0
10 Apr 2022
Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks
Haoran Yang
Piji Li
Wai Lam
77
4
0
10 Apr 2022