ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1910.10683
  4. Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
v1v2v3v4 (latest)

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat
ArXiv (abs)PDFHTML

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

43 / 9,843 papers shown
Title
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
668
4,935
0
23 Jan 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CEAIMat
128
1,817
0
22 Jan 2020
FixMatch: Simplifying Semi-Supervised Learning with Consistency and
  Confidence
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
Kihyuk Sohn
David Berthelot
Chun-Liang Li
Zizhao Zhang
Nicholas Carlini
E. D. Cubuk
Alexey Kurakin
Han Zhang
Colin Raffel
AAML
165
3,595
0
21 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural
  Language Inference
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
369
1,627
0
21 Jan 2020
A multimodal deep learning approach for named entity recognition from
  social media
A multimodal deep learning approach for named entity recognition from social media
M. Asgari-Chenaghlu
M. Feizi-Derakhshi
Leili Farzinvash
M. Balafar
C. Motamed
60
29
0
19 Jan 2020
RobBERT: a Dutch RoBERTa-based Language Model
RobBERT: a Dutch RoBERTa-based Language Model
Pieter Delobelle
Thomas Winters
Bettina Berendt
86
240
0
17 Jan 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence
  Pre-training
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Weizhen Qi
Yu Yan
Yeyun Gong
Dayiheng Liu
Nan Duan
Jiusheng Chen
Ruofei Zhang
Ming Zhou
AI4TS
140
450
0
13 Jan 2020
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive
  Summarization
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Jingqing Zhang
Yao-Min Zhao
Mohammad Saleh
Peter J. Liu
RALM3DGS
307
2,057
0
18 Dec 2019
Multilingual is not enough: BERT for Finnish
Multilingual is not enough: BERT for Finnish
Antti Virtanen
Jenna Kanerva
Rami Ilo
Jouni Luoma
Juhani Luotolahti
T. Salakoski
Filip Ginter
S. Pyysalo
88
281
0
15 Dec 2019
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
J. Tian
A. Kreuzer
Pai-Hung Chen
Hans-Martin Will
VLM
60
3
0
13 Dec 2019
Extending Machine Language Models toward Human-Level Language
  Understanding
Extending Machine Language Models toward Human-Level Language Understanding
James L. McClelland
Felix Hill
Maja R. Rudolph
Jason Baldridge
Hinrich Schütze
LRM
78
35
0
12 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French
FlauBERT: Unsupervised Language Model Pre-training for French
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
111
401
0
11 Dec 2019
Zero-shot Text Classification With Generative Language Models
Zero-shot Text Classification With Generative Language Models
Raul Puri
Bryan Catanzaro
VLM
81
106
0
10 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art
  Baseline
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
109
117
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
12-in-1: Multi-Task Vision and Language Representation Learning
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLMObjD
122
481
0
05 Dec 2019
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
Alex Warstadt
Alicia Parrish
Haokun Liu
Anhad Mohananey
Wei Peng
Sheng-Fu Wang
Samuel R. Bowman
137
496
0
02 Dec 2019
What's Hidden in a Randomly Weighted Neural Network?
What's Hidden in a Randomly Weighted Neural Network?
Vivek Ramanujan
Mitchell Wortsman
Aniruddha Kembhavi
Ali Farhadi
Mohammad Rastegari
74
362
0
29 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal
  Transformers for TextVQA
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
91
197
0
14 Nov 2019
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language
  Representation
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
Xiaozhi Wang
Tianyu Gao
Zhaocheng Zhu
Zhengyan Zhang
Zhiyuan Liu
Juan-Zi Li
Jian Tang
170
675
0
13 Nov 2019
CamemBERT: a Tasty French Language Model
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
143
981
0
10 Nov 2019
INSET: Sentence Infilling with INter-SEntential Transformer
INSET: Sentence Infilling with INter-SEntential Transformer
Yichen Huang
Yizhe Zhang
Oussama Elachqar
Yu Cheng
50
1
0
10 Nov 2019
Learning to Few-Shot Learn Across Diverse Natural Language
  Classification Tasks
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
Trapit Bansal
Rishikesh Jha
Andrew McCallum
SSL
94
121
0
10 Nov 2019
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded
  Conversational Agents
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
Kurt Shuster
Da Ju
Stephen Roller
Emily Dinan
Y-Lan Boureau
Jason Weston
102
82
0
09 Nov 2019
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Nina Poerner
Ulli Waltinger
Hinrich Schütze
AI4TS
176
20
0
09 Nov 2019
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language
  Models through Principled Regularized Optimization
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
T. Zhao
135
563
0
08 Nov 2019
Contrastive Multi-document Question Generation
Contrastive Multi-document Question Generation
W. Cho
Yizhe Zhang
Sudha Rao
Asli Celikyilmaz
Chenyan Xiong
Jianfeng Gao
Mengdi Wang
Bill Dolan
SyDa
112
28
0
08 Nov 2019
BERTs of a feather do not generalize together: Large variability in
  generalization across models with similar test set performance
BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
R. Thomas McCoy
Junghyun Min
Tal Linzen
137
151
0
07 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
230
6,614
0
05 Nov 2019
DialoGPT: Large-Scale Generative Pre-training for Conversational
  Response Generation
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe Zhang
Siqi Sun
Michel Galley
Yen-Chun Chen
Chris Brockett
Xiang Gao
Jianfeng Gao
Jingjing Liu
W. Dolan
VLM
233
1,528
0
01 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
124
658
0
01 Nov 2019
Multi-Stage Document Ranking with BERT
Multi-Stage Document Ranking with BERT
Rodrigo Nogueira
Wei Yang
Kyunghyun Cho
Jimmy J. Lin
91
398
0
31 Oct 2019
Discourse-Aware Neural Extractive Text Summarization
Discourse-Aware Neural Extractive Text Summarization
Jiacheng Xu
Zhe Gan
Yu Cheng
Jingjing Liu
BDL
150
283
0
30 Oct 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language
  Generation, Translation, and Comprehension
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMatVLM
268
10,907
0
29 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language
  Representations
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSLAIMat
415
6,479
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
296
443
0
25 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
73
258
0
23 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
126
1,881
0
23 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using
  Model Parallelism
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
363
1,925
0
17 Sep 2019
Taming Momentum in a Distributed Asynchronous Environment
Taming Momentum in a Distributed Asynchronous Environment
Ido Hakimi
Saar Barkai
Moshe Gabel
Assaf Schuster
93
23
0
26 Jul 2019
Contextual Word Representations: A Contextual Introduction
Contextual Word Representations: A Contextual Introduction
Noah A. Smith
64
34
0
15 Feb 2019
Are All Layers Created Equal?
Are All Layers Created Equal?
Chiyuan Zhang
Samy Bengio
Y. Singer
111
140
0
06 Feb 2019
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
Tian Shi
Yaser Keneshloo
Naren Ramakrishnan
Chandan K. Reddy
127
234
0
05 Dec 2018
Deep Learning for Genomics: A Concise Overview
Deep Learning for Genomics: A Concise Overview
Tianwei Yue
Yuanxin Wang
Longxiang Zhang
Chunming Gu
Haohan Wang
Wenping Wang
Qi Lyu
Yujie Dun
AILawVLMBDL
86
91
0
02 Feb 2018
Previous
123...195196197