ResearchTrend.AI
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
Towards Learning Universal Hyperparameter Optimizers with Transformers
Yutian Chen
Xingyou Song
Chansoo Lee
Zehao Wang
Qiuyi Zhang
...
Greg Kochanski
Arnaud Doucet
Marc'Aurelio Ranzato
Sagi Perel
Nando de Freitas
105
65
0
26 May 2022
Revisiting Generative Commonsense Reasoning: A Pre-Ordering Approach
Chao Zhao
Faeze Brahman
Tenghao Huang
Snigdha Chaturvedi
LRM
66
5
0
26 May 2022
Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation
Li Mingzhe
Xiexiong Lin
Preslav Nakov
Jinxiong Chang
Qishen Zhang
...
Taifeng Wang
Zhongyi Liu
Wei Chu
Dongyan Zhao
Rui Yan
141
12
0
26 May 2022
BiT: Robustly Binarized Multi-distilled Transformer
Zechun Liu
Barlas Oğuz
Aasish Pappu
Lin Xiao
Scott Yih
Meng Li
Raghuraman Krishnamoorthi
Yashar Mehdad
MQ
123
55
0
25 May 2022
Reasoning over Logically Interacted Conditions for Question Answering
Haitian Sun
William W. Cohen
Ruslan Salakhutdinov
113
6
0
25 May 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts
Qinyuan Ye
Juan Zha
Xiang Ren
MoE
75
14
0
25 May 2022
PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
Ao Liu
Haoyu Dong
Naoaki Okazaki
Shi Han
Dongmei Zhang
LMTD
69
21
0
25 May 2022
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
114
20
0
25 May 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
Hyunwoo J. Kim
Youngjae Yu
Liwei Jiang
Ximing Lu
Daniel Khashabi
Gunhee Kim
Yejin Choi
Maarten Sap
112
128
0
25 May 2022
Obj2Sub: Unsupervised Conversion of Objective to Subjective Questions
Aarish Chhabra
Nandini Bansal
Venktesh V
Mukesh Mohania
Deep Dwivedi
23
0
0
25 May 2022
Investigating the Benefits of Free-Form Rationales
Jiao Sun
Swabha Swayamdipta
Jonathan May
Xuezhe Ma
90
15
0
25 May 2022
Discovering Language-neutral Sub-networks in Multilingual Language Models
Negar Foroutan
Mohammadreza Banaei
R. Lebret
Antoine Bosselut
Karl Aberer
LRM
126
25
0
25 May 2022
QAMPARI: An Open-domain Question Answering Benchmark for Questions with Many Answers from Multiple Paragraphs
S. Amouyal
Tomer Wolfson
Ohad Rubin
Ori Yoran
Jonathan Herzig
Jonathan Berant
RALM VLM
88
27
0
25 May 2022
Few-shot Reranking for Multi-hop QA via Language Model Prompting
Muhammad Khalifa
Lajanugen Logeswaran
Moontae Lee
Ho Hin Lee
Lu Wang
LRM
114
20
0
25 May 2022
Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation
Tu Vu
Aditya Barua
Brian Lester
Daniel Cer
Mohit Iyyer
Noah Constant
CLL
95
66
0
25 May 2022
Asking the Right Questions in Low Resource Template Extraction
Nils Holzenberger
Yunmo Chen
Benjamin Van Durme
95
4
0
25 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
Jinho Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
160
37
0
25 May 2022
Generating Information-Seeking Conversations from Unlabeled Documents
Gangwoo Kim
Sungdong Kim
Kang Min Yoo
Jaewoo Kang
61
13
0
25 May 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
100
18
0
25 May 2022
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data
Xiaochuang Han
Yulia Tsvetkov
97
31
0
25 May 2022
RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning
Soumya Sanyal
Zeyi Liao
Xiang Ren
ELM ReLM LRM
120
21
0
25 May 2022
TAGPRIME: A Unified Framework for Relational Structure Extraction
I-Hung Hsu
Kuan-Hao Huang
Shuning Zhang
Wen-Huang Cheng
Premkumar Natarajan
Kai-Wei Chang
Nanyun Peng
64
14
0
25 May 2022
Gradient-Based Constrained Sampling from Language Models
Sachin Kumar
Biswajit Paria
Yulia Tsvetkov
BDL
99
57
0
25 May 2022
Memorization in NLP Fine-tuning Methods
Fatemehsadat Mireshghallah
Archit Uniyal
Tianhao Wang
David Evans
Taylor Berg-Kirkpatrick
AAML
123
43
0
25 May 2022
Low Resource Style Transfer via Domain Adaptive Meta Learning
Xiangyang Li
Xiang Long
Yu Xia
Sujian Li
58
10
0
25 May 2022
Learning a Better Initialization for Soft Prompts via Meta-Learning
Yukun Huang
Kun Qian
Zhou Yu
VLM
141
9
0
25 May 2022
Recipe for a General, Powerful, Scalable Graph Transformer
Ladislav Rampášek
Mikhail Galkin
Vijay Prakash Dwivedi
Anh Tuan Luu
Guy Wolf
Dominique Beaini
177
582
0
25 May 2022
Generating Natural Language Proofs with Verifier-Guided Search
Kaiyu Yang
Jia Deng
Danqi Chen
LRM
130
72
0
25 May 2022
Counterfactual Data Augmentation improves Factuality of Abstractive Summarization
Dheeraj Rajagopal
Siamak Shakeri
Cicero Nogueira dos Santos
Eduard H. Hovy
Chung-Ching Chang
HILM
125
10
0
25 May 2022
FLUTE: Figurative Language Understanding through Textual Explanations
Tuhin Chakrabarty
Arkadiy Saakyan
Debanjan Ghosh
Smaranda Muresan
117
73
0
24 May 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
James Lee-Thorp
Joshua Ainslie
MoE
94
12
0
24 May 2022
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL LRM
211
123
0
24 May 2022
Toxicity Detection with Generative Prompt-based Inference
Yau-Shian Wang
Y. Chang
150
37
0
24 May 2022
Learning to Model Editing Processes
Machel Reid
Graham Neubig
KELM BDL
187
36
0
24 May 2022
Medical Scientific Table-to-Text Generation with Human-in-the-Loop under the Data Sparsity Constraint
Heng-Yi Wu
Jingqing Zhang
Julia Ive
T. Li
Vibhor Gupta
Bingyuan Chen
Yike Guo
LMTD MedIm
74
2
0
24 May 2022
TALM: Tool Augmented Language Models
Aaron T Parisi
Yao-Min Zhao
Noah Fiedel
KELM RALM LLMAG
108
148
0
24 May 2022
Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Linlu Qiu
Peter Shaw
Panupong Pasupat
Tianze Shi
Jonathan Herzig
Emily Pitler
Fei Sha
Kristina Toutanova
AI4CE LRM
160
54
0
24 May 2022
PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation
Aitor Ormazabal
Mikel Artetxe
Manex Agirrezabal
Aitor Soroa Etxabe
Eneko Agirre
67
21
0
24 May 2022
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Bo Zhao
RALM
278
126
0
24 May 2022
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai
Mohammadreza Salehi
Matthew E. Peters
Hannaneh Hajishirzi
192
102
0
24 May 2022
Lutma: a Frame-Making Tool for Collaborative FrameNet Development
Tiago Timponi Torrent
Arthur Lorenzi
E. Matos
Frederico Belcavello
Marcelo Viridiano
Maucha Gamonal
55
1
0
24 May 2022
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Sosuke Kobayashi
Shun Kiyono
Jun Suzuki
Kentaro Inui
MoMe
77
9
0
24 May 2022
Lack of Fluency is Hurting Your Translation Model
J. Yoo
Jaewoo Kang
66
0
0
24 May 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLM LRM
321
197
0
24 May 2022
Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code
Changan Niu
Chuanyi Li
Bin Luo
Vincent Ng
SyDa VLM
107
50
0
24 May 2022
On the Role of Bidirectionality in Language Model Pre-Training
Mikel Artetxe
Jingfei Du
Naman Goyal
Luke Zettlemoyer
Ves Stoyanov
200
17
0
24 May 2022
Workflow Discovery from Dialogues in the Low Data Regime
Amine El Hattami
Stefania Raimondo
I. Laradji
David Vazquez
Pau Rodríguez López
C. Pal
86
11
0
24 May 2022
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
Shruti Palaskar
Akshita Bhagia
Yonatan Bisk
Florian Metze
A. Black
Ana Marasović
84
4
0
24 May 2022
FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?
Shikhar Tuli
Bhishma Dedhia
Shreshth Tuli
N. Jha
94
14
0
23 May 2022
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando
Gerard I. Gállego
Belen Alastruey
Carlos Escolano
Marta R. Costa-jussá
174
46
0
23 May 2022