1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,877 papers shown
Title
Controlling Conditional Language Models without Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
AI4CE
115
35
0
01 Dec 2021
Towards More Robust Natural Language Understanding
Xinliang Frederick Zhang
65
2
0
01 Dec 2021
Systematic Generalization with Edge Transformers
Leon Bergen
Timothy J. O'Donnell
Dzmitry Bahdanau
65
47
0
01 Dec 2021
NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging
Zihan Liu
Feijun Jiang
Yuxiang Hu
Chen Shi
Pascale Fung
126
38
0
01 Dec 2021
Wiki to Automotive: Understanding the Distribution Shift and its impact on Named Entity Recognition
Anmol Nayak
Hariprasad Timmapathini
OOD
35
3
0
01 Dec 2021
Show Your Work: Scratchpads for Intermediate Computation with Language Models
Maxwell Nye
Anders Andreassen
Guy Gur-Ari
Henryk Michalewski
Jacob Austin
...
Aitor Lewkowycz
Maarten Bosma
D. Luan
Charles Sutton
Augustus Odena
ReLM
LRM
215
757
0
30 Nov 2021
Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking
Ronen Tamari
Kyle Richardson
Aviad Sar-Shalom
Noam Kahlon
Nelson F. Liu
Reut Tsarfaty
Dafna Shahaf
115
5
0
30 Nov 2021
Refined Commonsense Knowledge from Large-Scale Web Contents
Shrestha Ghosh
Simon Razniewski
Julien Romero
Gerhard Weikum
86
32
0
30 Nov 2021
Action based Network for Conversation Question Reformulation
Zheyu Ye
Jiang Liu
Qian Yu
Jianxun Ju
51
0
0
29 Nov 2021
PSG: Prompt-based Sequence Generation for Acronym Extraction
Bin Li
Fei Xia
Yi-Zhong Weng
Xiusheng Huang
Bin Sun
Shutao Li
57
6
0
29 Nov 2021
Long-range and hierarchical language predictions in brains and algorithms
Charlotte Caucheteux
Alexandre Gramfort
J. King
52
22
0
28 Nov 2021
True Few-Shot Learning with Prompts -- A Real-World Perspective
Timo Schick
Hinrich Schütze
VLM
113
64
0
26 Nov 2021
Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress
Kichang Yang
KELM
VLM
79
12
0
25 Nov 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
110
75
0
25 Nov 2021
Less is More: Generating Grounded Navigation Instructions from Landmarks
Su Wang
Ceslee Montgomery
Jordi Orbay
Vighnesh Birodkar
Aleksandra Faust
Izzeddin Gur
Natasha Jaques
Austin Waters
Jason Baldridge
Peter Anderson
135
64
0
25 Nov 2021
Sparse is Enough in Scaling Transformers
Sebastian Jaszczur
Aakanksha Chowdhery
Afroz Mohiuddin
Lukasz Kaiser
Wojciech Gajewski
Henryk Michalewski
Jonni Kanerva
MoE
71
102
0
24 Nov 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
150
245
0
24 Nov 2021
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling
Tsu-Jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
Wenjie Wang
Lijuan Wang
Zicheng Liu
VLM
148
221
0
24 Nov 2021
Knowledge Enhanced Sports Game Summarization
Jiaan Wang
Zhixu Li
Tingyi Zhang
Duo Zheng
Jianfeng Qu
An Liu
Lei Zhao
Zhigang Chen
AI4TS
75
12
0
24 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
146
117
0
23 Nov 2021
Learning Symbolic Rules for Reasoning in Quasi-Natural Language
Kaiyu Yang
Jia Deng
NAI
ReLM
LRM
61
13
0
23 Nov 2021
Realistic simulation of users for IT systems in cyber ranges
Alexandre Dey
Benjamin Costé
Éric Totel
Adrien Bécue
19
1
0
23 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollár
Kaiming He
Ross B. Girshick
113
170
0
22 Nov 2021
DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion
Arthur Douillard
Alexandre Ramé
Guillaume Couairon
Matthieu Cord
CLL
149
313
0
22 Nov 2021
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture
Daria Bakshandaeva
Denis Dimitrov
V.Ya. Arkhipkin
Alex Shonenkov
M. Potanin
...
Mikhail Martynov
Anton Voronov
Vera Davydova
E. Tutubalina
Aleksandr Petiushko
99
0
0
22 Nov 2021
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
V. Aribandi
Yi Tay
Tal Schuster
J. Rao
H. Zheng
...
Jianmo Ni
Jai Gupta
Kai Hui
Sebastian Ruder
Donald Metzler
MoE
125
216
0
22 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
284
1,839
0
18 Nov 2021
Training Neural Networks with Fixed Sparse Masks
Yi-Lin Sung
Varun Nair
Colin Raffel
FedML
109
209
0
18 Nov 2021
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
107
403
0
18 Nov 2021
LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model
Yukyung Lee
Jina Kim
Pilsung Kang
64
84
0
18 Nov 2021
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
Pengcheng He
Jianfeng Gao
Weizhu Chen
239
1,212
0
18 Nov 2021
RoBERTuito: a pre-trained language model for social media text in Spanish
Juan Manuel Pérez
D. Furman
Laura Alonso Alemany
Franco Luque
75
100
0
18 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
114
713
0
17 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
110
34
0
16 Nov 2021
A Comparative Study on Transfer Learning and Distance Metrics in Semantic Clustering over the COVID-19 Tweets
E. Zafarani-Moattar
M. Kangavari
A. Rahmani
64
2
0
16 Nov 2021
Document AI: Benchmarks, Models and Applications
Lei Cui
Yiheng Xu
Tengchao Lv
Furu Wei
VLM
89
74
0
16 Nov 2021
Few-Shot Self-Rationalization with Natural Language Prompts
Ana Marasović
Iz Beltagy
Doug Downey
Matthew E. Peters
LRM
91
110
0
16 Nov 2021
Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair
Jason Phang
Angelica Chen
William Huang
Samuel R. Bowman
AAML
76
14
0
16 Nov 2021
Solving Linear Algebra by Program Synthesis
Iddo Drori
Nakul Verma
33
21
0
16 Nov 2021
LiT: Zero-Shot Transfer with Locked-image text Tuning
Xiaohua Zhai
Tianlin Li
Basil Mustafa
Andreas Steiner
Daniel Keysers
Alexander Kolesnikov
Lucas Beyer
VLM
168
561
0
15 Nov 2021
Say What? Collaborative Pop Lyric Generation Using Multitask Transfer Learning
Naveen Ram
Tanay Gummadi
Rahul Bhethanabotla
Richard J. Savery
Gil Weinberg
51
9
0
15 Nov 2021
Time Waits for No One! Analysis and Challenges of Temporal Misalignment
Kelvin Luu
Daniel Khashabi
Suchin Gururangan
Karishma Mandyam
Noah A. Smith
113
92
0
14 Nov 2021
On Transferability of Prompt Tuning for Natural Language Processing
Yusheng Su
Xiaozhi Wang
Yujia Qin
Chi-Min Chan
Yankai Lin
...
Peng Li
Juanzi Li
Lei Hou
Maosong Sun
Jie Zhou
AAML
VLM
83
106
0
12 Nov 2021
Extraction of Medication Names from Twitter Using Augmentation and an Ensemble of Language Models
I. Kulev
Berkay Köprü
Raul Rodriguez-Esteban
Diego Saldana Miranda
Yi Huang
Alessandro La Torraca
Elif Özkirimli
MedIm
47
4
0
12 Nov 2021
AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization
Alexander R. Fabbri
Xiaojian Wu
Srini Iyer
Haoran Li
Mona T. Diab
63
15
0
11 Nov 2021
SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets
Ann Yuan
Daphne Ippolito
Vitaly Nikolaev
Chris Callison-Burch
Andy Coenen
Sebastian Gehrmann
SyDa
190
23
0
11 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
612
7,866
0
11 Nov 2021
Amazon SageMaker Model Parallelism: A General and Flexible Framework for Large Model Training
C. Karakuş
R. Huilgol
Leilei Gan
Anirudh Subramanian
Cade Daniel
D. Çavdar
Teng Xu
Haohan Chen
Arash Rahnama
L. Quintela
MoE
AI4CE
58
29
0
10 Nov 2021
Recent Advances in Automated Question Answering In Biomedical Domain
K. D. Baksi
64
0
0
10 Nov 2021
AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale
Yao Lu
Karol Hausman
Yevgen Chebotar
Mengyuan Yan
Eric Jang
...
Ted Xiao
A. Irpan
Mohi Khansari
Dmitry Kalashnikov
Sergey Levine
OffRL
197
61
0
09 Nov 2021