1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,883 papers shown
Title
Continuous Active Learning Using Pretrained Transformers
Nima Sadri
G. Cormack
KELM
54
2
0
15 Aug 2022
An Answer Verbalization Dataset for Conversational Question Answering over Knowledge Graphs
Endri Kacupaj
Kuldeep Singh
M. Maleshkova
Jens Lehmann
52
1
0
13 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes ALTUNCU
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
80
3
0
11 Aug 2022
Reducing Retraining by Recycling Parameter-Efficient Prompts
Brian Lester
Joshua Yurtsever
Siamak Shakeri
Noah Constant
51
12
0
10 Aug 2022
CoditT5: Pretraining for Source Code and Natural Language Editing
Jiyang Zhang
Sheena Panthaplackel
Pengyu Nie
Junyi Jessy Li
Miloš Gligorić
KELM
93
93
0
10 Aug 2022
Limitations of Language Models in Arithmetic and Symbolic Induction
Jingu Qian
Hong Wang
Zekun Li
Shiyang Li
Xifeng Yan
ReLM
LRM
139
76
0
09 Aug 2022
An Embarrassingly Easy but Strong Baseline for Nested Named Entity Recognition
Hang Yan
Yu Sun
Xiaonan Li
Xipeng Qiu
77
28
0
09 Aug 2022
A Theoretical View on Sparsely Activated Networks
Cenk Baykal
Nishanth Dikkala
Rina Panigrahy
Cyrus Rashtchian
Xin Wang
28
11
0
08 Aug 2022
Investigating Efficiently Extending Transformers for Long Input Summarization
Jason Phang
Yao-Min Zhao
Peter J. Liu
RALM
LLMAG
85
63
0
08 Aug 2022
Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval
Zehan Li
Nan Yang
Liang Wang
Furu Wei
77
8
0
08 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
101
16
0
08 Aug 2022
Template-based Abstractive Microblog Opinion Summarisation
I. Bilal
Bo Wang
Adam Tsakalidis
Dong Nguyen
Rob Procter
Maria Liakata
65
12
0
08 Aug 2022
Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation
Mandar M. Kulkarni
Soumya Chennabasavaraj
Nikesh Garera
50
3
0
07 Aug 2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
Margaret Li
Suchin Gururangan
Tim Dettmers
M. Lewis
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoMe
110
154
0
05 Aug 2022
Improving Task Generalization via Unified Schema Prompt
Wanjun Zhong
Yifan Gao
Ning Ding
Zhiyuan Liu
Ming Zhou
Jiahai Wang
Jian Yin
Nan Duan
84
8
0
05 Aug 2022
Construction of English Resume Corpus and Test with Pre-trained Language Models
Chengguang Gan
Tatsunori Mori
22
3
0
05 Aug 2022
Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey
Xiaoyu Shen
Svitlana Vakulenko
Marco Del Tredici
Gianni Barlacchi
Bill Byrne
Adria de Gispert
RALM
VLM
75
20
0
05 Aug 2022
Prompt Tuning for Generative Multimodal Pretrained Models
Han Yang
Junyang Lin
An Yang
Peng Wang
Chang Zhou
Hongxia Yang
VLM
LRM
VPVLM
86
31
0
04 Aug 2022
Introducing BEREL: BERT Embeddings for Rabbinic-Encoded Language
Avi Shmidman
Joshua Guedalia
Shaltiel Shmidman
C. Shmidman
Eli Handel
Moshe Koppel
VLM
24
6
0
03 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
247
1,796
0
02 Aug 2022
ferret: a Framework for Benchmarking Explainers on Transformers
Giuseppe Attanasio
Eliana Pastor
C. Bonaventura
Debora Nozza
90
31
0
02 Aug 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Saleh Soltan
Shankar Ananthakrishnan
Jack G. M. FitzGerald
Rahul Gupta
Wael Hamza
...
Mukund Sridhar
Fabian Triefenbach
Apurv Verma
Gokhan Tur
Premkumar Natarajan
129
83
0
02 Aug 2022
Multilingual Coreference Resolution in Multiparty Dialogue
Boyuan Zheng
Patrick Xia
M. Yarmohammadi
Benjamin Van Durme
111
4
0
02 Aug 2022
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
79
22
0
01 Aug 2022
Multi-Document Summarization with Centroid-Based Pretraining
Ratish Puduppully
Parag Jain
Nancy F. Chen
Mark Steedman
VLM
70
12
0
01 Aug 2022
Video Question Answering with Iterative Video-Text Co-Tokenization
A. Piergiovanni
K. Morton
Weicheng Kuo
Michael S. Ryoo
A. Angelova
104
18
0
01 Aug 2022
Efficient Long-Text Understanding with Short-Text Models
Maor Ivgi
Uri Shaham
Jonathan Berant
VLM
128
84
0
01 Aug 2022
Thutmose Tagger: Single-pass neural model for Inverse Text Normalization
Alexandra Antonova
Evelina Bakhturina
Boris Ginsburg
10
5
0
29 Jul 2022
"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking
Léo Jacqmin
L. Rojas-Barahona
Benoit Favre
100
29
0
29 Jul 2022
Efficient Training of Language Models to Fill in the Middle
Mohammad Bavarian
Heewoo Jun
Nikolas Tezak
John Schulman
C. McLeavey
Jerry Tworek
Mark Chen
94
197
0
28 Jul 2022
Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Amir Feder
Abhilasha Ravichander
Marius Mosbach
Yonatan Belinkov
Hinrich Schütze
Yoav Goldberg
CML
SyDa
MILM
110
55
0
28 Jul 2022
Sequence to sequence pretraining for a less-resourced Slovenian language
Matej Ulčar
Marko Robnik-Šikonja
AIMat
68
17
0
28 Jul 2022
RealTime QA: What's the Answer Right Now?
Jungo Kasai
Keisuke Sakaguchi
Yoichi Takahashi
Ronan Le Bras
Akari Asai
Xinyan Velocity Yu
Dragomir R. Radev
Noah A. Smith
Yejin Choi
Kentaro Inui
KELM
164
194
0
27 Jul 2022
Controllable User Dialogue Act Augmentation for Dialogue State Tracking
Chun-Mao Lai
Ming-Hao Hsu
Chao-Wei Huang
Yun-Nung Chen
68
6
0
26 Jul 2022
Fine-Tuning BERT for Automatic ADME Semantic Labeling in FDA Drug Labeling to Enhance Product-Specific Guidance Assessment
Yiwen Shi
Jing Wang
Ping Ren
Taha ValizadehAslani
Yi Zhang
Meng Hu
Hualou Liang
AI4MH
AAML
73
17
0
25 Jul 2022
Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models
Yucheng Cai
Hong Liu
Zhijian Ou
Y. Huang
Junlan Feng
BDL
57
3
0
25 Jul 2022
Neurosymbolic Repair for Low-Code Formula Languages
Rohan Bavishi
Harshit Joshi
José Pablo Cambronero Sánchez
Anna Fariha
Sumit Gulwani
Vu Le
Ivan Radicek
A. Tiwari
65
13
0
24 Jul 2022
A Transformer-based Neural Language Model that Synthesizes Brain Activation Maps from Free-Form Text Queries
G. Ngo
Minh Le Nguyen
Nancy F. Chen
M. Sabuncu
MedIm
47
7
0
24 Jul 2022
No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence
Chaozheng Wang
Yuanhang Yang
Cuiyun Gao
Yun Peng
Hongyu Zhang
Michael R. Lyu
AAML
115
145
0
24 Jul 2022
Generalized Attention Mechanism and Relative Position for Transformer
R. Pandya
ViT
18
1
0
24 Jul 2022
Context based lemmatizer for Polish language
Michał Karwatowski
Marcin Pietroń
28
1
0
23 Jul 2022
High-Resolution Swin Transformer for Automatic Medical Image Segmentation
Chen Wei
Shenghan Ren
Kaitai Guo
Haihong Hu
Jimin Liang
ViT
OOD
MedIm
59
43
0
23 Jul 2022
PanGu-Coder: Program Synthesis with Function-Level Language Modeling
Fenia Christopoulou
Gerasimos Lampouras
Milan Gritta
Guchun Zhang
Yinpeng Guo
...
Guangtai Liang
Jia Wei
Xin Jiang
Qianxiang Wang
Qun Liu
ELM
SyDa
ALM
109
76
0
22 Jul 2022
Two-Stage Fine-Tuning: A Novel Strategy for Learning Class-Imbalanced Data
Taha ValizadehAslani
Yiwen Shi
Jing Wang
Ping Ren
Yi Zhang
Meng Hu
Lianggong Zhao
Hualou Liang
68
7
0
22 Jul 2022
Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing
Aditya Desai
K. Zhou
Anshumali Shrivastava
43
1
0
21 Jul 2022
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Yi Tay
Mostafa Dehghani
Samira Abnar
Hyung Won Chung
W. Fedus
J. Rao
Sharan Narang
Vinh Q. Tran
Dani Yogatama
Donald Metzler
AI4CE
125
107
0
21 Jul 2022
CodeT: Code Generation with Generated Tests
Bei Chen
Fengji Zhang
A. Nguyen
Daoguang Zan
Zeqi Lin
Jian-Guang Lou
Weizhu Chen
105
349
0
21 Jul 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
89
74
0
20 Jul 2022
Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets
Yi Yang
Chen Zhang
Benyou Wang
Dawei Song
LRM
100
6
0
20 Jul 2022
MoEC: Mixture of Expert Clusters
Yuan Xie
Shaohan Huang
Tianyu Chen
Furu Wei
MoE
91
11
0
19 Jul 2022