ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1910.10683
  4. Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat
ArXivPDFHTML

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 8,824 papers shown
Title
ForecastQA: A Question Answering Challenge for Event Forecasting with
  Temporal Text Data
ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data
Woojeong Jin
Rahul Khanna
Suji Kim
Dong-Ho Lee
Fred Morstatter
Aram Galstyan
Xiang Ren
AI4TS
11
36
0
02 May 2020
UnifiedQA: Crossing Format Boundaries With a Single QA System
UnifiedQA: Crossing Format Boundaries With a Single QA System
Daniel Khashabi
Sewon Min
Tushar Khot
Ashish Sabharwal
Oyvind Tafjord
Peter Clark
Hannaneh Hajishirzi
35
720
0
02 May 2020
Probing Contextual Language Models for Common Ground with Visual
  Representations
Probing Contextual Language Models for Common Ground with Visual Representations
Gabriel Ilharco
Rowan Zellers
Ali Farhadi
Hannaneh Hajishirzi
30
14
0
01 May 2020
POINTER: Constrained Progressive Text Generation via Insertion-based
  Generative Pre-training
POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training
Yizhe Zhang
Guoyin Wang
Chunyuan Li
Zhe Gan
Chris Brockett
Bill Dolan
34
30
0
01 May 2020
Beneath the Tip of the Iceberg: Current Challenges and New Directions in
  Sentiment Analysis Research
Beneath the Tip of the Iceberg: Current Challenges and New Directions in Sentiment Analysis Research
Soujanya Poria
Devamanyu Hazarika
Navonil Majumder
Rada Mihalcea
42
207
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation
  Pre-training
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
46
493
0
01 May 2020
Recipes for Adapting Pre-trained Monolingual and Multilingual Models to
  Machine Translation
Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation
Asa Cooper Stickland
Xian Li
Marjan Ghazvininejad
36
44
0
30 Apr 2020
Mind Your Inflections! Improving NLP for Non-Standard Englishes with
  Base-Inflection Encoding
Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding
Samson Tan
Chenyu You
L. Varshney
Min-Yen Kan
17
34
0
30 Apr 2020
Pre-training Is (Almost) All You Need: An Application to Commonsense
  Reasoning
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning
Alexandre Tamborrino
Nicola Pellicanò
B. Pannier
Pascal Voitot
Louise Naudin
LRM
22
62
0
29 Apr 2020
Multi-Task Learning for Dense Prediction Tasks: A Survey
Multi-Task Learning for Dense Prediction Tasks: A Survey
Simon Vandenhende
Stamatios Georgoulis
Wouter Van Gansbeke
Marc Proesmans
Dengxin Dai
Luc Van Gool
CVBM
29
72
0
28 Apr 2020
Syntactic Data Augmentation Increases Robustness to Inference Heuristics
Syntactic Data Augmentation Increases Robustness to Inference Heuristics
Junghyun Min
R. Thomas McCoy
Dipanjan Das
Emily Pitler
Tal Linzen
30
175
0
24 Apr 2020
Generative Data Augmentation for Commonsense Reasoning
Generative Data Augmentation for Commonsense Reasoning
Yiben Yang
Chaitanya Malaviya
Jared Fernandez
Swabha Swayamdipta
Ronan Le Bras
Ji-ping Wang
Chandra Bhagavatula
Yejin Choi
Doug Downey
LRM
27
91
0
24 Apr 2020
A Review of Winograd Schema Challenge Datasets and Approaches
A Review of Winograd Schema Challenge Datasets and Approaches
Vid Kocijan
Thomas Lukasiewicz
E. Davis
G. Marcus
L. Morgenstern
25
43
0
23 Apr 2020
MT-Clinical BERT: Scaling Clinical Information Extraction with Multitask
  Learning
MT-Clinical BERT: Scaling Clinical Information Extraction with Multitask Learning
Andriy Mulyar
Bridget T. McInnes
16
52
0
21 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
Guanming Xiong
26
0
0
20 Apr 2020
The Cost of Training NLP Models: A Concise Overview
The Cost of Training NLP Models: A Concise Overview
Or Sharir
Barak Peleg
Y. Shoham
40
210
0
19 Apr 2020
Can You Put it All Together: Evaluating Conversational Agents' Ability
  to Blend Skills
Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills
Eric Michael Smith
Mary Williamson
Kurt Shuster
Jason Weston
Y-Lan Boureau
19
221
0
17 Apr 2020
The Right Tool for the Job: Matching Model and Instance Complexities
The Right Tool for the Job: Matching Model and Instance Complexities
Roy Schwartz
Gabriel Stanovsky
Swabha Swayamdipta
Jesse Dodge
Noah A. Smith
38
167
0
16 Apr 2020
CrisisBench: Benchmarking Crisis-related Social Media Datasets for
  Humanitarian Information Processing
CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing
Firoj Alam
Hassan Sajjad
Muhammad Imran
Ferda Ofli
18
14
0
14 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu
Hai Hu
Xuanwei Zhang
Lu Li
Chenjie Cao
...
Cong Yue
Xinrui Zhang
Zhen-Yi Yang
Kyle Richardson
Zhenzhong Lan
ELM
45
377
0
13 Apr 2020
Unsupervised Commonsense Question Answering with Self-Talk
Unsupervised Commonsense Question Answering with Self-Talk
Vered Shwartz
Peter West
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
ReLM
SSL
AI4MH
LRM
19
257
0
11 Apr 2020
Longformer: The Long-Document Transformer
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
28
3,929
0
10 Apr 2020
Exploring Versatile Generative Language Model Via Parameter-Efficient
  Transfer Learning
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
Zhaojiang Lin
Andrea Madotto
Pascale Fung
34
155
0
08 Apr 2020
Downstream Model Design of Pre-trained Language Model for Relation
  Extraction Task
Downstream Model Design of Pre-trained Language Model for Relation Extraction Task
Cheng-rong Li
Ye Tian
19
36
0
08 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom
Greg Durrett
19
200
0
07 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
57
354
0
05 Apr 2020
Unsupervised Domain Clusters in Pretrained Language Models
Unsupervised Domain Clusters in Pretrained Language Models
Roee Aharoni
Yoav Goldberg
37
244
0
05 Apr 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Chunyuan Li
Xiang Gao
Yuan Li
Baolin Peng
Xiujun Li
Yizhe Zhang
Jianfeng Gao
SSL
DRL
32
181
0
05 Apr 2020
A Hierarchical Network for Abstractive Meeting Summarization with
  Cross-Domain Pretraining
A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining
Chenguang Zhu
Ruochen Xu
Michael Zeng
Xuedong Huang
BDL
AI4TS
26
18
0
04 Apr 2020
TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance
  Generation
TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation
Shaojie Jiang
Thomas Wolf
Christof Monz
Maarten de Rijke
33
11
0
26 Mar 2020
A Survey of Deep Learning for Scientific Discovery
A Survey of Deep Learning for Scientific Discovery
M. Raghu
Erica Schmidt
OOD
AI4CE
40
120
0
26 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
243
1,452
0
18 Mar 2020
A Survey on Contextual Embeddings
A Survey on Contextual Embeddings
Qi Liu
Matt J. Kusner
Phil Blunsom
225
146
0
16 Mar 2020
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language
  Understanding
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng Huang
Peng Xu
Davis Liang
Ajay K. Mishra
Bing Xiang
15
31
0
16 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical
  Model
Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu
Hsiang-Fu Yu
Inderjit Dhillon
Cho-Jui Hsieh
16
107
0
13 Mar 2020
ReZero is All You Need: Fast Convergence at Large Depth
ReZero is All You Need: Fast Convergence at Large Depth
Thomas C. Bachlechner
Bodhisattwa Prasad Majumder
H. H. Mao
G. Cottrell
Julian McAuley
AI4CE
21
276
0
10 Mar 2020
Adaptive Name Entity Recognition under Highly Unbalanced Data
Adaptive Name Entity Recognition under Highly Unbalanced Data
Thong Nguyen
Duy Nguyen
Pramod Rao
6
9
0
10 Mar 2020
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language
  Model
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model
Liang Xu
Xuanwei Zhang
Qianqian Dong
SSL
16
70
0
03 Mar 2020
Med7: a transferable clinical natural language processing model for
  electronic health records
Med7: a transferable clinical natural language processing model for electronic health records
Andrey Kormilitzin
N. Vaci
Qiang Liu
A. Nevado-Holgado
22
115
0
03 Mar 2020
AraBERT: Transformer-based Model for Arabic Language Understanding
AraBERT: Transformer-based Model for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
46
941
0
28 Feb 2020
UniLMv2: Pseudo-Masked Language Models for Unified Language Model
  Pre-Training
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao
Li Dong
Furu Wei
Wenhui Wang
Nan Yang
...
Yu-Chiang Frank Wang
Songhao Piao
Jianfeng Gao
Ming Zhou
H. Hon
AI4CE
38
392
0
28 Feb 2020
On Feature Normalization and Data Augmentation
On Feature Normalization and Data Augmentation
Boyi Li
Felix Wu
Ser-Nam Lim
Serge J. Belongie
Kilian Q. Weinberger
21
134
0
25 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression
  of Pre-Trained Transformers
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
47
1,209
0
25 Feb 2020
Training Question Answering Models From Synthetic Data
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
M. Shoeybi
Bryan Catanzaro
ELM
24
159
0
22 Feb 2020
LAMBERT: Layout-Aware (Language) Modeling for information extraction
LAMBERT: Layout-Aware (Language) Modeling for information extraction
Lukasz Garncarek
Rafal Powalski
Tomasz Stanislawek
Bartosz Topolski
Piotr Halama
M. Turski
Filip Graliñski
8
87
0
19 Feb 2020
GLU Variants Improve Transformer
GLU Variants Improve Transformer
Noam M. Shazeer
75
931
0
12 Feb 2020
REALM: Retrieval-Augmented Language Model Pre-Training
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
45
2,000
0
10 Feb 2020
Semi-Supervised Class Discovery
Semi-Supervised Class Discovery
Jeremy Nixon
J. Liu
David Berthelot
20
2
0
10 Feb 2020
Momentum Improves Normalized SGD
Momentum Improves Normalized SGD
Ashok Cutkosky
Harsh Mehta
ODL
18
118
0
09 Feb 2020
Segmented Graph-Bert for Graph Instance Modeling
Segmented Graph-Bert for Graph Instance Modeling
Jiawei Zhang
SSeg
25
6
0
09 Feb 2020
Previous
123...175176177
Next