ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (arXiv:1910.10683)

23 October 2019
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising
K. Xuan, Yongbo Wang, Yongliang Wang, Zujie Wen, Yang Dong
17 May 2021

Doc2Dict: Information Extraction as Text Generation
Benjamin Townsend, Eamon Ito-Fisher, Lily Zhang, Madison May
16 May 2021

EASE: Extractive-Abstractive Summarization with Explanations
Haoran Li, Arash Einolghozati, Srini Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad
14 May 2021

A cost-benefit analysis of cross-lingual transfer methods
G. Rosa, L. Bonifacio, Leandro Rodrigues de Souza, R. Lotufo, Rodrigo Nogueira
14 May 2021

RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling
Yizhe Zhang, Siqi Sun, Xiang Gao, Yuwei Fang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan
14 May 2021
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters
Yan Xu, Etsuko Ishii, Samuel Cahyawijaya, Zihan Liu, Genta Indra Winata, Andrea Madotto, Dan Su, Pascale Fung
13 May 2021

Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level
Ruiqi Zhong, Dhruba Ghosh, Dan Klein, Jacob Steinhardt
13 May 2021

Improving Code Autocompletion with Transfer Learning
Wenjie Zhou, Seohyun Kim, V. Murali, Gareth Ari Aye
12 May 2021

Go Beyond Plain Fine-tuning: Improving Pretrained Models for Social Commonsense
Ting-Yun Chang, Yang Liu, Karthik Gopalakrishnan, Behnam Hedayatnia, Pei Zhou, Dilek Z. Hakkani-Tür
12 May 2021

How Reliable are Model Diagnostics?
V. Aribandi, Yi Tay, Donald Metzler
12 May 2021
Incorporating Commonsense Knowledge Graph in Pretrained Models for Social Commonsense Tasks
Ting-Yun Chang, Yang Liu, Karthik Gopalakrishnan, Behnam Hedayatnia, Pei Zhou, Dilek Z. Hakkani-Tür
12 May 2021

Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy, Nicholas Vincent
11 May 2021

EL-Attention: Memory Efficient Lossless Attention for Generation
Yu Yan, Jiusheng Chen, Weizhen Qi, Nikhil Bhendawade, Yeyun Gong, Nan Duan, Ruofei Zhang
11 May 2021

T-EMDE: Sketching-based global similarity for cross-modal retrieval
Barbara Rychalska, Mikolaj Wieczorek, Jacek Dąbrowski
10 May 2021

Leveraging Slot Descriptions for Zero-Shot Cross-Domain Dialogue State Tracking
Zhaojiang Lin, Bing-Quan Liu, Seungwhan Moon, Paul A. Crook, Zhenpeng Zhou, Zhiguang Wang, Zhou Yu, Andrea Madotto, Eunjoon Cho, R. Subba
10 May 2021
MS MARCO: Benchmarking Ranking Models in the Large-Data Regime
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Fernando Campos, Jimmy J. Lin
09 May 2021

Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation
Zihan Liu, Genta Indra Winata, Pascale Fung
09 May 2021

Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies, Yoav Levine, Daniel Jannai, Amnon Shashua
09 May 2021

Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents
Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun
09 May 2021

FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon
09 May 2021
Long-Span Summarization via Local Attention and Content Selection
Potsawee Manakul, Mark Gales
08 May 2021

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. Andrew Schwartz
07 May 2021

Are Pre-trained Convolutions Better than Pre-trained Transformers?
Yi Tay, Mostafa Dehghani, J. Gupta, Dara Bahri, V. Aribandi, Zhen Qin, Donald Metzler
07 May 2021

A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers
Pradeep Dasigi, Kyle Lo, Iz Beltagy, Arman Cohan, Noah A. Smith, Matt Gardner
07 May 2021

Learning to Perturb Word Embeddings for Out-of-distribution QA
Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang
06 May 2021
A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations
Pierre Colombo, Chloé Clavel, Pablo Piantanida
06 May 2021

GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph
Junhan Yang, Zheng Liu, Shitao Xiao, Chaozhuo Li, Defu Lian, Sanjay Agrawal, Amit Singh, Guangzhong Sun, Xing Xie
06 May 2021

Towards General Natural Language Understanding with Probabilistic Worldbuilding
Abulhair Saparov, Tom Michael Mitchell
06 May 2021

Rethinking Search: Making Domain Experts out of Dilettantes
Donald Metzler, Yi Tay, Dara Bahri, Marc Najork
05 May 2021

HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish
Robert Mroczkowski, Piotr Rybak, Alina Wróblewska, Ireneusz Gawlik
04 May 2021
Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review
Eugene Yang, Sean MacAvaney, D. Lewis, O. Frieder
03 May 2021

Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
Nouha Dziri, Hannah Rashkin, Tal Linzen, David Reitter
30 Apr 2021

Entailment as Few-Shot Learner
Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
29 Apr 2021

A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations
Varun Nagaraj Rao, Xingjian Zhen, K. Hovsepian, Mingwei Shen
29 Apr 2021

MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores
Sheng-Chun Kao, T. Krishna
28 Apr 2021
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov, O. Serikov, Ekaterina Artemova
26 Apr 2021

Focused Attention Improves Document-Grounded Generation
Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, A. Black, Ruslan Salakhutdinov
26 Apr 2021

What Makes a Message Persuasive? Identifying Adaptations Towards Persuasiveness in Nine Exploratory Case Studies
Sebastian Duerr, Krystian Teodor Lange, P. Gloor
26 Apr 2021

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi-Lun Liao, ..., Gaojun Fan, Yaowei Wang, Xuefeng Jin, Qun Liu, Yonghong Tian
26 Apr 2021

XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond
Francesco Barbieri, Luis Espinosa Anke, Jose Camacho-Collados
25 Apr 2021
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Cheng Chen, Yichun Yin, Lifeng Shang, Zhi Wang, Xin Jiang, Xiao Chen, Qun Liu
24 Apr 2021

A Survey of Modern Deep Learning based Object Detection Models
Syed Sahil Abbas Zaidi, M. S. Ansari, Asra Aslam, N. Kanwal, M. Asghar, Brian Lee
24 Apr 2021

Generating abstractive summaries of Lithuanian news articles using a transformer model
Lukas Stankevicius, M. Lukoševičius
23 Apr 2021

Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu
23 Apr 2021

Automated News Summarization Using Transformers
Anushka Gupta, Diksha Chugh, Anjum, R. Katarya
23 Apr 2021
Transfer training from smaller language model
Han Zhang
23 Apr 2021

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Huayu Chen, Boqing Gong
22 Apr 2021

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith
22 Apr 2021

Carbon Emissions and Large Neural Network Training
David A. Patterson, Joseph E. Gonzalez, Quoc V. Le, Chen Liang, Lluís-Miquel Munguía, D. Rothchild, David R. So, Maud Texier, J. Dean
21 Apr 2021

Efficient Retrieval Optimized Multi-task Learning
He Fun, S. Gandhi, Sujith Ravi
20 Apr 2021