ResearchTrend.AI
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
Bayesian Active Summarization
Alexios Gidiotis
Grigorios Tsoumakas
BDL
87
7
0
09 Oct 2021
RankingMatch: Delving into Semi-Supervised Learning with Consistency Regularization and Ranking Loss
Trung Q. Tran
Mingu Kang
Daeyoung Kim
35
2
0
09 Oct 2021
Towards a Unified View of Parameter-Efficient Transfer Learning
Junxian He
Chunting Zhou
Xuezhe Ma
Taylor Berg-Kirkpatrick
Graham Neubig
AAML
178
958
0
08 Oct 2021
KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering
Donghan Yu
Chenguang Zhu
Yuwei Fang
Wenhao Yu
Shuohang Wang
Yichong Xu
Xiang Ren
Yiming Yang
Michael Zeng
93
90
0
08 Oct 2021
Taming Sparsely Activated Transformer with Stochastic Experts
Simiao Zuo
Xiaodong Liu
Jian Jiao
Young Jin Kim
Hany Hassan
Ruofei Zhang
T. Zhao
Jianfeng Gao
MoE
123
115
0
08 Oct 2021
Iterative Decoding for Compositional Generalization in Transformers
Luana Ruiz
Joshua Ainslie
Santiago Ontañón
65
6
0
08 Oct 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin
An Yang
Jinze Bai
Chang Zhou
Le Jiang
...
Jie Zhang
Yong Li
Wei Lin
Jingren Zhou
Hongxia Yang
MoE
163
43
0
08 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
150
9
0
08 Oct 2021
Noisy Text Data: Achilles' Heel of popular transformer based NLP models
Kartikay Bagla
Ankit Kumar
Shivam Gupta
Anuj Gupta
51
6
0
07 Oct 2021
Towards Continual Knowledge Learning of Language Models
Joel Jang
Seonghyeon Ye
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Stanley Jungkyu Choi
Minjoon Seo
CLL, KELM
303
161
0
07 Oct 2021
Cut the CARP: Fishing for zero-shot story evaluation
Shahbuland Matiana
J. Smith
Ryan Teehan
Louis Castricato
Stella Biderman
Leo Gao
Spencer Frazier
122
16
0
06 Oct 2021
Capturing Structural Locality in Non-parametric Language Models
Frank F. Xu
Junxian He
Graham Neubig
Vincent J. Hellendoorn
112
14
0
06 Oct 2021
8-bit Optimizers via Block-wise Quantization
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
MQ
152
305
0
06 Oct 2021
How BPE Affects Memorization in Transformers
Eugene Kharitonov
Marco Baroni
Dieuwke Hupkes
247
33
0
06 Oct 2021
Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers
Narsimha Chilkuri
Eric Hunsberger
Aaron R. Voelker
G. Malik
C. Eliasmith
74
7
0
05 Oct 2021
Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning
Christopher Rytting
David Wingate
AI4CE, LRM
79
27
0
05 Oct 2021
COVIDRead: A Large-scale Question Answering Dataset on COVID-19
Tanik Saikh
Sovan Kumar Sahoo
Asif Ekbal
P. Bhattacharyya
61
5
0
05 Oct 2021
Data Augmentation Approaches in Natural Language Processing: A Survey
Bohan Li
Yutai Hou
Wanxiang Che
219
284
0
05 Oct 2021
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Zhengyan Zhang
Yankai Lin
Zhiyuan Liu
Peng Li
Maosong Sun
Jie Zhou
MoE
109
129
0
05 Oct 2021
Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA
Manuel R. Ciosici
Joe Cecil
Alex Hedges
Dong-Ho Lee
Marjorie Freedman
R. Weischedel
58
10
0
04 Oct 2021
Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics
Prajjwal Bhargava
Aleksandr Drozd
Anna Rogers
159
108
0
04 Oct 2021
Skill Induction and Planning with Latent Language
Pratyusha Sharma
Antonio Torralba
Jacob Andreas
LM&Ro
263
112
0
04 Oct 2021
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis
Abhik Jana
D. Hartung
M. Bommarito
Ion Androutsopoulos
Daniel Martin Katz
Nikolaos Aletras
AILaw, ELM
288
267
0
03 Oct 2021
Aspect Sentiment Quad Prediction as Paraphrase Generation
Wenxuan Zhang
Yang Deng
Xin Li
Yifei Yuan
Lidong Bing
W. Lam
288
192
0
02 Oct 2021
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching
Vaibhav Adlakha
Shehzaad Dhuliawala
Kaheer Suleman
H. D. Vries
Siva Reddy
BDL
111
92
0
02 Oct 2021
Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models
Robert Wolfe
Aylin Caliskan
125
51
0
01 Oct 2021
Semi-Supervised Text Classification via Self-Pretraining
Payam Karisani
Negin Karisani
SSL, VLM
65
22
0
30 Sep 2021
SlovakBERT: Slovak Masked Language Model
Matúš Pikuliak
Stefan Grivalsky
Martin Konopka
Miroslav Blšták
Martin Tamajka
Viktor Bachratý
Marian Simko
Pavol Balázik
Michal Trnka
Filip Uhlárik
66
27
0
30 Sep 2021
Compositional generalization in semantic parsing with pretrained transformers
A. Orhan
76
6
0
30 Sep 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
116
108
0
30 Sep 2021
BERT got a Date: Introducing Transformers to Temporal Tagging
Satya Almasian
Dennis Aumiller
Michael Gertz
67
15
0
30 Sep 2021
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System
Yixuan Su
Lei Shu
Elman Mansimov
Arshit Gupta
Deng Cai
Yi-An Lai
Yi Zhang
226
195
0
29 Sep 2021
Single-dataset Experts for Multi-dataset Question Answering
Dan Friedman
Ben Dodge
Danqi Chen
MoMe
180
26
0
28 Sep 2021
FQuAD2.0: French Question Answering and knowing that you know nothing
Quentin Heinrich
Gautier Viaud
Wacim Belblidia
66
8
0
27 Sep 2021
Multiplicative Position-aware Transformer Models for Language Understanding
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
27
1
0
27 Sep 2021
Joint Multimedia Event Extraction from Video and Article
Brian Chen
Xudong Lin
Christopher Thomas
Manling Li
Shoya Yoshida
Lovish Chum
Heng Ji
Shih-Fu Chang
VGen
83
26
0
27 Sep 2021
Paradigm Shift in Natural Language Processing
Tianxiang Sun
Xiangyang Liu
Xipeng Qiu
Xuanjing Huang
222
82
0
26 Sep 2021
More Than Reading Comprehension: A Survey on Datasets and Metrics of Textual Question Answering
Yang Bai
D. Wang
164
10
0
25 Sep 2021
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
MoE
257
112
0
24 Sep 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM, VPVLM, VLM
300
224
0
24 Sep 2021
Document Automation Architectures and Technologies: A Survey
Mohammad Ahmadi Achachlouei
Omkar Patil
Tarun Joshi
V. Nair
AI4CE
40
6
0
23 Sep 2021
Finding a Balanced Degree of Automation for Summary Evaluation
Shiyue Zhang
Mohit Bansal
115
44
0
23 Sep 2021
Automated Fact-Checking: A Survey
Xia Zeng
Amani S. Abumansour
A. Zubiaga
HILM
268
96
0
23 Sep 2021
Zero-Shot Information Extraction as a Unified Text-to-Triple Translation
Chenguang Wang
Xiao Liu
Zui Chen
Haoyun Hong
Jie Tang
Basel Alomair
239
36
0
23 Sep 2021
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
227
303
0
22 Sep 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM, ViT, VLM
298
351
0
22 Sep 2021
Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing
K. Kanakarajan
Bhuvana Kundumani
Malaikannan Sankarasubbu
ALM, MoE
62
5
0
22 Sep 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
285
115
0
22 Sep 2021
MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization
Xinnuo Xu
Ondrej Dusek
Shashi Narayan
Verena Rieser
Ioannis Konstas
HILM
55
6
0
22 Sep 2021
Enriching and Controlling Global Semantics for Text Summarization
Thong Nguyen
Anh Tuan Luu
Truc Lu
Tho Quan
48
35
0
22 Sep 2021