ResearchTrend.AI
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 8,987 papers shown
Title
Transflower: probabilistic autoregressive dance generation with multimodal attention
Guillermo Valle Pérez
G. Henter
Jonas Beskow
A. Holzapfel
Pierre-Yves Oudeyer
Simon Alexanderson
30
42
0
25 Jun 2021
XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages
Tahmid Hasan
Abhik Bhattacharjee
Md. Saiful Islam
Kazi Samin Mubasshir
Yuan-Fang Li
Yong-Bin Kang
M. Rahman
Rifat Shahriyar
37
344
0
25 Jun 2021
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Alexandre Muzio
Saksham Singhal
Hany Awadalla
Xia Song
Furu Wei
SLR
AI4CE
25
80
0
25 Jun 2021
Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature
Yu-Chiang Frank Wang
Jinchao Li
Tristan Naumann
Chenyan Xiong
Hao Cheng
...
Yang Qin
Eric Horvitz
Paul N. Bennett
Jianfeng Gao
Hoifung Poon
OOD
33
13
0
25 Jun 2021
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
Yi Tay
Vinh Q. Tran
Sebastian Ruder
Jai Gupta
Hyung Won Chung
Dara Bahri
Zhen Qin
Simon Baumgartner
Cong Yu
Donald Metzler
51
153
0
23 Jun 2021
Learn to Resolve Conversational Dependency: A Consistency Training Framework for Conversational Question Answering
Gangwoo Kim
Hyunjae Kim
Jungsoo Park
Jaewoo Kang
36
38
0
22 Jun 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
41
271
0
22 Jun 2021
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
55
809
0
22 Jun 2021
GAIA: A Transfer Learning System of Object Detection that Fits Your Needs
Xingyuan Bu
Junran Peng
Junjie Yan
Tieniu Tan
Zhaoxiang Zhang
ObjD
VLM
31
53
0
21 Jun 2021
CPM-2: Large-scale Cost-effective Pre-trained Language Models
Zhengyan Zhang
Yuxian Gu
Xu Han
Shengqi Chen
Chaojun Xiao
...
Minlie Huang
Wentao Han
Yang Liu
Xiaoyan Zhu
Maosong Sun
MoE
37
86
0
20 Jun 2021
JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs
Pei Ke
Haozhe Ji
Yuanyuan Ran
Xin Cui
Liwei Wang
Linfeng Song
Xiaoyan Zhu
Minlie Huang
59
95
0
19 Jun 2021
Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Jerret Ross
Brian M. Belgodere
Vijil Chenthamarakshan
Inkit Padhi
Youssef Mroueh
Payel Das
AI4CE
32
274
0
17 Jun 2021
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei
Sang Michael Xie
Tengyu Ma
24
97
0
17 Jun 2021
Can I Be of Further Assistance? Using Unstructured Knowledge Access to Improve Task-oriented Conversational Modeling
Di Jin
Seokhwan Kim
Dilek Z. Hakkani-Tür
26
14
0
16 Jun 2021
Automatic Construction of Evaluation Suites for Natural Language Generation Datasets
Simon Mille
Kaustubh D. Dhole
Saad Mahamood
Laura Perez-Beltrachini
Varun Gangal
Mihir Kale
Emiel van Miltenburg
Sebastian Gehrmann
ELM
47
22
0
16 Jun 2021
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
Haoming Jiang
Danqing Zhang
Tianyu Cao
Bing Yin
T. Zhao
NoLa
30
44
0
16 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
68
2,751
0
15 Jun 2021
Communicating Natural Programs to Humans and Machines
Samuel Acquaviva
Yewen Pu
Marta Kryven
Theo Sechopoulos
Catherine Wong
Gabrielle Ecanow
Maxwell Nye
Michael Henry Tessler
J. Tenenbaum
38
40
0
15 Jun 2021
Improving Paraphrase Detection with the Adversarial Paraphrasing Task
Animesh Nighojkar
John Licato
25
39
0
14 Jun 2021
Evaluating Various Tokenizers for Arabic Text Classification
Zaid Alyafeai
Maged S. Al-Shaibani
Mustafa Ghaleb
Irfan Ahmad
37
41
0
14 Jun 2021
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
Jiaao Chen
Derek Tam
Colin Raffel
Joey Tianyi Zhou
Diyi Yang
33
172
0
14 Jun 2021
GitTables: A Large-Scale Corpus of Relational Tables
Madelon Hulsebos
Çağatay Demiralp
Paul T. Groth
LMTD
26
83
0
14 Jun 2021
Automatic Document Sketching: Generating Drafts from Analogous Texts
Zeqiu Wu
Michel Galley
Chris Brockett
Yizhe Zhang
Bill Dolan
52
5
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
58
816
0
14 Jun 2021
Can Transformer Language Models Predict Psychometric Properties?
Antonio Laverghetta
Animesh Nighojkar
Jamshidbek Mirzakhalov
John Licato
LM&MA
38
14
0
12 Jun 2021
Prompting Contrastive Explanations for Commonsense Reasoning Tasks
Bhargavi Paranjape
Julian Michael
Marjan Ghazvininejad
Luke Zettlemoyer
Hannaneh Hajishirzi
ReLM
LRM
22
66
0
12 Jun 2021
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
Zewen Chi
Li Dong
Bo Zheng
Shaohan Huang
Xian-Ling Mao
Heyan Huang
Furu Wei
45
67
0
11 Jun 2021
FedNLP: An interpretable NLP System to Decode Federal Reserve Communications
Jean Lee
Hoyoul Luis Youn
Nicholas Stevens
Josiah Poon
S. Han
24
10
0
11 Jun 2021
Generate, Annotate, and Learn: NLP with Synthetic Text
Xuanli He
Islam Nassar
J. Kiros
Gholamreza Haffari
Mohammad Norouzi
39
51
0
11 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
36
124
0
10 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
29
579
0
10 Jun 2021
Investigating Alternatives to the Root Mean Square for Adaptive Gradient Methods
Brett Daley
Chris Amato
ODL
40
0
0
10 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
48
435
0
09 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
49
1,172
0
09 Jun 2021
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi
James Henderson
Sebastian Ruder
MoE
67
469
0
08 Jun 2021
TIMEDIAL: Temporal Commonsense Reasoning in Dialog
Lianhui Qin
Aditya Gupta
Shyam Upadhyay
Luheng He
Yejin Choi
Manaal Faruqui
LRM
31
65
0
08 Jun 2021
XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation
Subhabrata Mukherjee
Ahmed Hassan Awadallah
Jianfeng Gao
19
22
0
08 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
53
1,088
0
08 Jun 2021
Meta-Learning to Compositionally Generalize
Henry Conklin
Bailin Wang
Kenny Smith
Ivan Titov
OOD
39
73
0
08 Jun 2021
Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering
Aditya Gupta
Jiacheng Xu
Shyam Upadhyay
Diyi Yang
Manaal Faruqui
40
33
0
08 Jun 2021
Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning
Piotr Piękos
Henryk Michalewski
Mateusz Malinowski
32
28
0
07 Jun 2021
Efficient Training of Visual Transformers with Small Datasets
Yahui Liu
E. Sangineto
Wei Bi
N. Sebe
Bruno Lepri
Marco De Nadai
ViT
36
167
0
07 Jun 2021
A Comprehensive Assessment of Dialog Evaluation Metrics
Yi-Ting Yeh
M. Eskénazi
Shikib Mehri
36
105
0
07 Jun 2021
BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling
Zhaojiang Lin
Andrea Madotto
Genta Indra Winata
Peng Xu
Feijun Jiang
Yuxiang Hu
Chen Shi
Pascale Fung
29
61
0
05 Jun 2021
Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models
J. Lamy-Poirier
MoE
29
8
0
04 Jun 2021
ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression
Weiyue Su
Xuyi Chen
Shi Feng
Jiaxiang Liu
Weixin Liu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
34
13
0
04 Jun 2021
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Ulme Wennberg
G. Henter
MILM
35
21
0
03 Jun 2021
Defending Against Backdoor Attacks in Natural Language Generation
Xiaofei Sun
Xiaoya Li
Yuxian Meng
Xiang Ao
Fei Wu
Jiwei Li
Tianwei Zhang
AAML
SILM
31
47
0
03 Jun 2021
Reordering Examples Helps during Priming-based Few-Shot Learning
Sawan Kumar
Partha P. Talukdar
20
58
0
03 Jun 2021
Can Generative Pre-trained Language Models Serve as Knowledge Bases for Closed-book QA?
Cunxiang Wang
Pai Liu
Yue Zhang
RALM
42
80
0
03 Jun 2021