Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models
Heting Gao
Junrui Ni
Kaizhi Qian
Yang Zhang
Shiyu Chang
M. Hasegawa-Johnson
VLM
175
31
0
29 Mar 2022
LinkBERT: Pretraining Language Models with Document Links
Michihiro Yasunaga
J. Leskovec
Percy Liang
KELM
108
361
0
29 Mar 2022
Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting
Gabriel Orlanski
LRM
67
2
0
29 Mar 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
217
1,992
0
29 Mar 2022
Improving Source Separation by Explicitly Modeling Dependencies Between Sources
Ethan Manilow
Curtis Hawthorne
Cheng-Zhi Anna Huang
Bryan Pardo
Jesse Engel
BDL
67
10
0
28 Mar 2022
A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation
Shashi Narayan
Gonçalo Simões
Yao-Min Zhao
Joshua Maynez
Dipanjan Das
Michael Collins
Mirella Lapata
93
30
0
28 Mar 2022
LogicInference: A New Dataset for Teaching Logical Inference to seq2seq Models
Santiago Ontanon
Joshua Ainslie
Vaclav Cvicek
Zachary Kenneth Fisher
NAI, ReLM, LRM
164
13
0
28 Mar 2022
HetuMoE: An Efficient Trillion-scale Mixture-of-Expert Distributed Training System
Xiaonan Nie
Pinxue Zhao
Xupeng Miao
Tong Zhao
Tengjiao Wang
MoE
88
39
0
28 Mar 2022
ANNA: Enhanced Language Representation for Question Answering
Changwook Jun
Hansol Jang
Myoseop Sim
Hyun Kim
Jooyoung Choi
Kyungkoo Min
Kyunghoon Bae
73
8
0
28 Mar 2022
Example-based Hypernetworks for Out-of-Distribution Generalization
Tomer Volk
Eyal Ben-David
Ohad Amosy
Gal Chechik
Roi Reichart
OOD
93
20
0
27 Mar 2022
Lite Unified Modeling for Discriminative Reading Comprehension
Yilin Zhao
Hai Zhao
Libin Shen
Yinggong Zhao
86
2
0
26 Mar 2022
Visual Abductive Reasoning
Chen Liang
Wenguan Wang
Tianfei Zhou
Yi Yang
LRM
92
40
0
26 Mar 2022
CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues
Deepanway Ghosal
Siqi Shen
Navonil Majumder
Rada Mihalcea
Soujanya Poria
105
54
0
25 Mar 2022
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Erik Nijkamp
Bo Pang
Hiroaki Hayashi
Lifu Tu
Haiquan Wang
Yingbo Zhou
Silvio Savarese
Caiming Xiong
ELM
181
1,054
0
25 Mar 2022
Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5)
Shijie Geng
Shuchang Liu
Zuohui Fu
Yingqiang Ge
Yongfeng Zhang
VLM, AI4TS
178
491
0
24 Mar 2022
Token Dropping for Efficient BERT Pretraining
Le Hou
Richard Yuanzhe Pang
Dinesh Manocha
Yuexin Wu
Xinying Song
Xiaodan Song
Denny Zhou
85
46
0
24 Mar 2022
Automatic Speech Recognition for Speech Assessment of Persian Preschool Children
Amirhossein Abaskohi
Fatemeh Mortazavi
Hadi Moradi
65
7
0
24 Mar 2022
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Xiaojun Chang
ViT
101
36
0
24 Mar 2022
Pathways: Asynchronous Distributed Dataflow for ML
P. Barham
Aakanksha Chowdhery
J. Dean
Sanjay Ghemawat
Steven Hand
...
Parker Schuh
Ryan Sepassi
Laurent El Shafey
C. A. Thekkath
Yonghui Wu
GNN, MoE
120
132
0
23 Mar 2022
A Survey on Cross-Lingual Summarization
Jiaan Wang
Fandong Meng
Duo Zheng
Yunlong Liang
Zhixu Li
Jianfeng Qu
Jie Zhou
AILaw
87
62
0
23 Mar 2022
ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention
Yang Liu
Jiaxiang Liu
L. Chen
Yuxiang Lu
Shi Feng
Zhida Feng
Yu Sun
Hao Tian
Huancheng Wu
Hai-feng Wang
70
9
0
23 Mar 2022
Pre-training to Match for Unified Low-shot Relation Extraction
Fangchao Liu
Hongyu Lin
Xianpei Han
Boxi Cao
Le Sun
VLM
45
35
0
23 Mar 2022
A Theoretically Grounded Benchmark for Evaluating Machine Commonsense
Henrique M. Dinis Santos
Ke Shen
Alice M. Mulvehill
Yasaman Razeghi
D. McGuinness
Mayank Kejriwal
ELM, LRM
70
4
0
23 Mar 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM, VPVLM
208
1,654
0
23 Mar 2022
Multi-Modal Learning for AU Detection Based on Multi-Head Fused Transformers
Xiang Zhang
L. Yin
ViT
71
12
0
22 Mar 2022
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
Zheng Li
Zijian Wang
Ming Tan
Ramesh Nallapati
Parminder Bhatia
Andrew O. Arnold
Bing Xiang
Dan Roth
MQ
74
44
0
21 Mar 2022
AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization
Moussa Kamal Eddine
Nadi Tomeh
Nizar Habash
Joseph Le Roux
Michalis Vazirgiannis
75
46
0
21 Mar 2022
Quality Controlled Paraphrase Generation
Elron Bandel
R. Aharonov
Michal Shmueli-Scheuer
Ilya Shnayderman
Noam Slonim
L. Ein-Dor
70
38
0
21 Mar 2022
XTREME-S: Evaluating Cross-lingual Speech Representations
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
...
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
VLM, AILaw, ELM
155
22
0
21 Mar 2022
HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long Document Summarization
Shuyang Cao
Lu Wang
111
36
0
21 Mar 2022
Compression of Generative Pre-trained Language Models via Quantization
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
80
104
0
21 Mar 2022
Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems
Yi-Lin Tuan
Sajjad Beygi
Maryam Fazel-Zarandi
Qiaozi Gao
Alessandra Cervone
William Yang Wang
LRM
76
23
0
20 Mar 2022
Cluster & Tune: Boost Cold Start Performance in Text Classification
Eyal Shnarch
Ariel Gera
Alon Halfon
Lena Dankin
Leshem Choshen
R. Aharonov
Noam Slonim
67
22
0
20 Mar 2022
On Robust Prefix-Tuning for Text Classification
Zonghan Yang
Yang Liu
VLM
68
21
0
19 Mar 2022
Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models
Ryokan Ri
Yoshimasa Tsuruoka
89
28
0
19 Mar 2022
Sequence-to-Sequence Knowledge Graph Completion and Question Answering
Apoorv Saxena
Adrian Kochsiek
Rainer Gemulla
AIMat
133
129
0
19 Mar 2022
FaiRR: Faithful and Robust Deductive Reasoning over Natural Language
Soumya Sanyal
Harman Singh
Xiang Ren
ReLM, LRM
106
46
0
19 Mar 2022
ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning
Ahmed Masry
Do Xuan Long
J. Tan
Shafiq Joty
Enamul Hoque
AIMat
138
687
0
19 Mar 2022
Challenges and Strategies in Cross-Cultural NLP
Daniel Hershcovich
Stella Frank
Heather Lent
Miryam de Lhoneux
Mostafa Abdou
...
Ruixiang Cui
Constanza Fierro
Katerina Margatina
Phillip Rust
Anders Søgaard
130
182
0
18 Mar 2022
Prototypical Verbalizer for Prompt-based Few-shot Tuning
Ganqu Cui
Shengding Hu
Ning Ding
Longtao Huang
Zhiyuan Liu
VLM
67
99
0
18 Mar 2022
HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information
Qianqian Ruan
Malte Ostendorff
Georg Rehm
AILaw
77
57
0
17 Mar 2022
DP-KB: Data Programming with Knowledge Bases Improves Transformer Fine Tuning for Answer Sentence Selection
Nic Jedema
Thuy Vu
Manish Gupta
Alessandro Moschitti
43
1
0
17 Mar 2022
Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models
Aaron Mueller
Robert Frank
Tal Linzen
Luheng Wang
Sebastian Schuster
AIMat
97
33
0
17 Mar 2022
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training
Yuxian Gu
Jiaxin Wen
Hao Sun
Yi Song
Pei Ke
...
Zheng Zhang
Jianzhu Yao
Lei Liu
Xiaoyan Zhu
Minlie Huang
91
55
0
17 Mar 2022
Automating Code Review Activities by Large-Scale Pre-training
Zhiyu Li
Shuai Lu
Daya Guo
Nan Duan
Shailesh Jannu
...
Deep Majumder
Jared Green
Alexey Svyatkovskiy
Shengyu Fu
Neel Sundaresan
VLM
106
153
0
17 Mar 2022
Memorizing Transformers
Yuhuai Wu
M. Rabe
DeLesley S. Hutchins
Christian Szegedy
RALM
109
179
0
16 Mar 2022
A Feasibility Study of Answer-Agnostic Question Generation for Education
Liam Dugan
E. Miltsakaki
Shriyash Upadhyay
Etan Ginsberg
Hannah Gonzalez
Dayheon Choi
Chuning Yuan
Chris Callison-Burch
84
13
0
16 Mar 2022
Geographic Adaptation of Pretrained Language Models
Valentin Hofmann
Goran Glavaš
Nikola Ljubešić
J. Pierrehumbert
Hinrich Schütze
VLM
126
17
0
16 Mar 2022
E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning
Jiangjie Chen
Rui Xu
Ziquan Fu
Wei Shi
Zhongqiao Li
Xinbo Zhang
Changzhi Sun
Lei Li
Yanghua Xiao
Hao Zhou
ELM
74
35
0
16 Mar 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction
Chen-Yu Lee
Chun-Liang Li
Timothy Dozat
Vincent Perot
Guolong Su
Nan Hua
Joshua Ainslie
Renshen Wang
Yasuhisa Fujii
Tomas Pfister
96
79
0
16 Mar 2022