1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
Title
Towards Learning Universal Hyperparameter Optimizers with Transformers
Yutian Chen
Xingyou Song
Chansoo Lee
Zehao Wang
Qiuyi Zhang
...
Greg Kochanski
Arnaud Doucet
Marc'Aurelio Ranzato
Sagi Perel
Nando de Freitas
105
65
0
26 May 2022
Revisiting Generative Commonsense Reasoning: A Pre-Ordering Approach
Chao Zhao
Faeze Brahman
Tenghao Huang
Snigdha Chaturvedi
LRM
66
5
0
26 May 2022
Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation
Li Mingzhe
Xiexiong Lin
Preslav Nakov
Jinxiong Chang
Qishen Zhang
...
Taifeng Wang
Zhongyi Liu
Wei Chu
Dongyan Zhao
Rui Yan
141
12
0
26 May 2022
BiT: Robustly Binarized Multi-distilled Transformer
Zechun Liu
Barlas Oğuz
Aasish Pappu
Lin Xiao
Scott Yih
Meng Li
Raghuraman Krishnamoorthi
Yashar Mehdad
MQ
123
55
0
25 May 2022
Reasoning over Logically Interacted Conditions for Question Answering
Haitian Sun
William W. Cohen
Ruslan Salakhutdinov
113
6
0
25 May 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts
Qinyuan Ye
Juan Zha
Xiang Ren
MoE
75
14
0
25 May 2022
PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
Ao Liu
Haoyu Dong
Naoaki Okazaki
Shi Han
Dongmei Zhang
LMTD
69
21
0
25 May 2022
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
114
20
0
25 May 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
Hyunwoo J. Kim
Youngjae Yu
Liwei Jiang
Ximing Lu
Daniel Khashabi
Gunhee Kim
Yejin Choi
Maarten Sap
112
128
0
25 May 2022
Obj2Sub: Unsupervised Conversion of Objective to Subjective Questions
Aarish Chhabra
Nandini Bansal
Venktesh V
Mukesh Mohania
Deep Dwivedi
23
0
0
25 May 2022
Investigating the Benefits of Free-Form Rationales
Jiao Sun
Swabha Swayamdipta
Jonathan May
Xuezhe Ma
90
15
0
25 May 2022
Discovering Language-neutral Sub-networks in Multilingual Language Models
Negar Foroutan
Mohammadreza Banaei
R. Lebret
Antoine Bosselut
Karl Aberer
LRM
126
25
0
25 May 2022
QAMPARI: An Open-domain Question Answering Benchmark for Questions with Many Answers from Multiple Paragraphs
S. Amouyal
Tomer Wolfson
Ohad Rubin
Ori Yoran
Jonathan Herzig
Jonathan Berant
RALM
VLM
88
27
0
25 May 2022
Few-shot Reranking for Multi-hop QA via Language Model Prompting
Muhammad Khalifa
Lajanugen Logeswaran
Moontae Lee
Ho Hin Lee
Lu Wang
LRM
114
20
0
25 May 2022
Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation
Tu Vu
Aditya Barua
Brian Lester
Daniel Cer
Mohit Iyyer
Noah Constant
CLL
95
66
0
25 May 2022
Asking the Right Questions in Low Resource Template Extraction
Nils Holzenberger
Yunmo Chen
Benjamin Van Durme
95
4
0
25 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
Jinho Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
160
37
0
25 May 2022
Generating Information-Seeking Conversations from Unlabeled Documents
Gangwoo Kim
Sungdong Kim
Kang Min Yoo
Jaewoo Kang
61
13
0
25 May 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
100
18
0
25 May 2022
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data
Xiaochuang Han
Yulia Tsvetkov
97
31
0
25 May 2022
RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning
Soumya Sanyal
Zeyi Liao
Xiang Ren
ELM
ReLM
LRM
120
21
0
25 May 2022
TAGPRIME: A Unified Framework for Relational Structure Extraction
I-Hung Hsu
Kuan-Hao Huang
Shuning Zhang
Wen-Huang Cheng
Premkumar Natarajan
Kai-Wei Chang
Nanyun Peng
64
14
0
25 May 2022
Gradient-Based Constrained Sampling from Language Models
Sachin Kumar
Biswajit Paria
Yulia Tsvetkov
BDL
99
57
0
25 May 2022
Memorization in NLP Fine-tuning Methods
Fatemehsadat Mireshghallah
Archit Uniyal
Tianhao Wang
David Evans
Taylor Berg-Kirkpatrick
AAML
123
43
0
25 May 2022
Low Resource Style Transfer via Domain Adaptive Meta Learning
Xiangyang Li
Xiang Long
Yu Xia
Sujian Li
58
10
0
25 May 2022
Learning a Better Initialization for Soft Prompts via Meta-Learning
Yukun Huang
Kun Qian
Zhou Yu
VLM
141
9
0
25 May 2022
Recipe for a General, Powerful, Scalable Graph Transformer
Ladislav Rampášek
Mikhail Galkin
Vijay Prakash Dwivedi
Anh Tuan Luu
Guy Wolf
Dominique Beaini
177
582
0
25 May 2022
Generating Natural Language Proofs with Verifier-Guided Search
Kaiyu Yang
Jia Deng
Danqi Chen
LRM
130
72
0
25 May 2022
Counterfactual Data Augmentation improves Factuality of Abstractive Summarization
Dheeraj Rajagopal
Siamak Shakeri
Cicero Nogueira dos Santos
Eduard H. Hovy
Chung-Ching Chang
HILM
125
10
0
25 May 2022
FLUTE: Figurative Language Understanding through Textual Explanations
Tuhin Chakrabarty
Arkadiy Saakyan
Debanjan Ghosh
Smaranda Muresan
117
73
0
24 May 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
James Lee-Thorp
Joshua Ainslie
MoE
94
12
0
24 May 2022
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
211
123
0
24 May 2022
Toxicity Detection with Generative Prompt-based Inference
Yau-Shian Wang
Y. Chang
150
37
0
24 May 2022
Learning to Model Editing Processes
Machel Reid
Graham Neubig
KELM
BDL
187
36
0
24 May 2022
Medical Scientific Table-to-Text Generation with Human-in-the-Loop under the Data Sparsity Constraint
Heng-Yi Wu
Jingqing Zhang
Julia Ive
T. Li
Vibhor Gupta
Bingyuan Chen
Yike Guo
LMTD
MedIm
74
2
0
24 May 2022
TALM: Tool Augmented Language Models
Aaron T Parisi
Yao-Min Zhao
Noah Fiedel
KELM
RALM
LLMAG
108
148
0
24 May 2022
Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Linlu Qiu
Peter Shaw
Panupong Pasupat
Tianze Shi
Jonathan Herzig
Emily Pitler
Fei Sha
Kristina Toutanova
AI4CE
LRM
160
54
0
24 May 2022
PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation
Aitor Ormazabal
Mikel Artetxe
Manex Agirrezabal
Aitor Soroa Etxabe
Eneko Agirre
67
21
0
24 May 2022
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Bo Zhao
RALM
278
126
0
24 May 2022
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai
Mohammadreza Salehi
Matthew E. Peters
Hannaneh Hajishirzi
192
102
0
24 May 2022
Lutma: a Frame-Making Tool for Collaborative FrameNet Development
Tiago Timponi Torrent
Arthur Lorenzi
E. Matos
Frederico Belcavello
Marcelo Viridiano
Maucha Gamonal
55
1
0
24 May 2022
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Sosuke Kobayashi
Shun Kiyono
Jun Suzuki
Kentaro Inui
MoMe
77
9
0
24 May 2022
Lack of Fluency is Hurting Your Translation Model
J. Yoo
Jaewoo Kang
66
0
0
24 May 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLM
LRM
321
197
0
24 May 2022
Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code
Changan Niu
Chuanyi Li
Bin Luo
Vincent Ng
SyDa
VLM
107
50
0
24 May 2022
On the Role of Bidirectionality in Language Model Pre-Training
Mikel Artetxe
Jingfei Du
Naman Goyal
Luke Zettlemoyer
Ves Stoyanov
200
17
0
24 May 2022
Workflow Discovery from Dialogues in the Low Data Regime
Amine El Hattami
Stefania Raimondo
I. Laradji
David Vazquez
Pau Rodríguez López
C. Pal
86
11
0
24 May 2022
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
Shruti Palaskar
Akshita Bhagia
Yonatan Bisk
Florian Metze
A. Black
Ana Marasović
84
4
0
24 May 2022
FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?
Shikhar Tuli
Bhishma Dedhia
Shreshth Tuli
N. Jha
94
14
0
23 May 2022
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando
Gerard I. Gállego
Belen Alastruey
Carlos Escolano
Marta R. Costa-jussá
174
46
0
23 May 2022