1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,866 papers shown
Title
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Tianxiang Sun
Xiangyang Liu
Wei-wei Zhu
Zhichao Geng
Lingling Wu
Yilong He
Yuan Ni
Guotong Xie
Xuanjing Huang
Xipeng Qiu
85
41
0
03 Mar 2022
Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking
Jamin Shin
Hangyeol Yu
Hyeongdon Moon
Andrea Madotto
Juneyoung Park
84
29
0
03 Mar 2022
Controlling the Focus of Pretrained Language Generation Models
Jiabao Ji
Yoon Kim
James R. Glass
Tianxing He
118
5
0
02 Mar 2022
A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges
Wenxuan Zhang
Xin Li
Yang Deng
Lidong Bing
W. Lam
108
255
0
02 Mar 2022
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding
Qiaole Dong
Chenjie Cao
Yanwei Fu
CLL
107
139
0
02 Mar 2022
HyperPrompt: Prompt-based Task-Conditioning of Transformers
Yun He
H. Zheng
Yi Tay
Jai Gupta
Yu Du
...
Yaguang Li
Zhaoji Chen
Donald Metzler
Heng-Tze Cheng
Ed H. Chi
LRM
VLM
93
93
0
01 Mar 2022
E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models
Mohammad Akbari
Amin Banitalebi-Dehkordi
Yong Zhang
69
8
0
01 Mar 2022
Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots
Wenting Zhao
Ye Liu
Yao Wan
Philip S. Yu
68
11
0
01 Mar 2022
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE
AI4CE
136
162
0
01 Mar 2022
Read before Generate! Faithful Long Form Question Answering with Machine Reading
Dan Su
Xiaoguang Li
Jindi Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
HILM
73
61
0
01 Mar 2022
Combining Modular Skills in Multitask Learning
Edoardo Ponti
Alessandro Sordoni
Yoshua Bengio
Siva Reddy
MoE
77
38
0
28 Feb 2022
KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models
Daniel Gao
Yantao Jia
Lei Li
Chengzhen Fu
Zhicheng Dou
Hao Jiang
Xinyu Zhang
Lei Chen
Bo Zhao
KELM
74
8
0
28 Feb 2022
Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking
T. Lai
Heng Ji
ChengXiang Zhai
RALM
100
12
0
27 Feb 2022
A Generative Model for Relation Extraction and Classification
Jian Ni
Gaetano Rossiello
A. Gliozzo
Radu Florian
42
5
0
26 Feb 2022
A Systematic Evaluation of Large Language Models of Code
Frank F. Xu
Uri Alon
Graham Neubig
Vincent J. Hellendoorn
ELM
ALM
239
661
0
26 Feb 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG
LRM
196
1,504
0
25 Feb 2022
Morphology Without Borders: Clause-Level Morphology
Omer Goldman
Reut Tsarfaty
AILaw
69
3
0
25 Feb 2022
Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction
Yubo Ma
Zehao Wang
Yixin Cao
Mukai Li
Meiqi Chen
Kunze Wang
Jing Shao
107
135
0
24 Feb 2022
Cognitive Semantic Communication Systems Driven by Knowledge Graph
Fuhui Zhou
Yihao Li
Xinyuan Zhang
Qihui Wu
Xianfu Lei
R. Hu
104
70
0
24 Feb 2022
Using natural language prompts for machine translation
Xavier Garcia
Orhan Firat
AI4CE
93
33
0
23 Feb 2022
UnifiedQA-v2: Stronger Generalization via Broader Cross-Format Training
Daniel Khashabi
Yeganeh Kordi
Hannaneh Hajishirzi
107
67
0
23 Feb 2022
FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks
Maksim Zubkov
Daniil Gavrilov
47
0
0
23 Feb 2022
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
Alyssa Lees
Vinh Q. Tran
Yi Tay
Jeffrey Scott Sorensen
Jai Gupta
Donald Metzler
Lucy Vasserman
90
193
0
22 Feb 2022
Neural Program Repair: Systems, Challenges and Solutions
Wenkang Zhong
Chuanyi Li
Jidong Ge
B. Luo
71
13
0
22 Feb 2022
Transformer Quality in Linear Time
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
147
232
0
21 Feb 2022
Items from Psychometric Tests as Training Data for Personality Profiling Models of Twitter Users
Anne Kreuter
Kai Sassenberg
Roman Klinger
46
6
0
21 Feb 2022
Ligandformer: A Graph Neural Network for Predicting Compound Property with Robust Interpretation
Jinjiang Guo
Qi Liu
Han Guo
Xi Lu
AI4CE
26
3
0
21 Feb 2022
Contextual Semantic Embeddings for Ontology Subsumption Prediction
Jiaoyan Chen
Yuan He
Yuxia Geng
Ernesto Jiménez-Ruiz
Hang Dong
Ian Horrocks
98
55
0
20 Feb 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
Farshid Faal
K. Schmitt
Jia Yuan Yu
83
25
0
19 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
315
376
0
18 Feb 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
218
205
0
17 Feb 2022
SGPT: GPT Sentence Embeddings for Semantic Search
Niklas Muennighoff
RALM
179
190
0
17 Feb 2022
Limitations of Neural Collapse for Understanding Generalization in Deep Learning
Like Hui
M. Belkin
Preetum Nakkiran
AI4CE
65
56
0
17 Feb 2022
Should You Mask 15% in Masked Language Modeling?
Alexander Wettig
Tianyu Gao
Zexuan Zhong
Danqi Chen
CVBM
105
167
0
16 Feb 2022
Probing Pretrained Models of Source Code
Sergey Troshin
Nadezhda Chirkova
ELM
107
39
0
16 Feb 2022
Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen
Fangyu Liu
Zaiqiao Meng
Shangsong Liang
64
95
0
16 Feb 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Tao Ge
Si-Qing Chen
Furu Wei
MoE
91
23
0
16 Feb 2022
CIS²: A Simplified Commonsense Inference Evaluation for Story Prose
Bryan Li
Lara J. Martin
Chris Callison-Burch
LRM
75
6
0
16 Feb 2022
ITTC @ TREC 2021 Clinical Trials Track
Thinh Hung Truong
Yulia Otmakhova
Rahmad Mahendra
Timothy Baldwin
Jey Han Lau
Trevor Cohn
L. Cavedon
Damiano Spina
Karin Verspoor
46
4
0
16 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
110
66
0
15 Feb 2022
Predicting on the Edge: Identifying Where a Larger Model Does Better
Taman Narayan
Heinrich Jiang
Sen Zhao
Surinder Kumar
79
7
0
15 Feb 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
133
632
0
15 Feb 2022
MuLD: The Multitask Long Document Benchmark
G. Hudson
Noura Al Moubayed
105
11
0
15 Feb 2022
Saving Dense Retriever from Shortcut Dependency in Conversational Search
Sungdong Kim
Gangwoo Kim
81
27
0
15 Feb 2022
Impact of Pretraining Term Frequencies on Few-Shot Reasoning
Yasaman Razeghi
Robert L Logan IV
Matt Gardner
Sameer Singh
ReLM
LRM
110
157
0
15 Feb 2022
A Survey on Model Compression and Acceleration for Pretrained Language Models
Canwen Xu
Julian McAuley
106
61
0
15 Feb 2022
A Survey on Dynamic Neural Networks for Natural Language Processing
Canwen Xu
Julian McAuley
AI4CE
103
29
0
15 Feb 2022
Discriminability-enforcing loss to improve representation learning
Florinel-Alin Croitoru
Diana-Nicoleta Grigore
Radu Tudor Ionescu
FaML
45
1
0
14 Feb 2022
Transformer Memory as a Differentiable Search Index
Yi Tay
Vinh Q. Tran
Mostafa Dehghani
Jianmo Ni
Dara Bahri
...
Zhe Zhao
Jai Gupta
Tal Schuster
William W. Cohen
Donald Metzler
123
288
0
14 Feb 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
158
95
0
14 Feb 2022