ResearchTrend.AI
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
Title
REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction
Sheng Zhang
Patrick Ng
Zhiguo Wang
Bing Xiang
63
5
0
10 Jun 2022
Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models
Zhiquan Lai
Shengwei Li
Xudong Tang
Ke-shi Ge
Weijie Liu
Yabo Duan
Linbo Qiao
Dongsheng Li
93
46
0
10 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
100
57
0
09 Jun 2022
Revisiting End-to-End Speech-to-Text Translation From Scratch
Biao Zhang
Barry Haddow
Rico Sennrich
81
39
0
09 Jun 2022
TwiBot-22: Towards Graph-Based Twitter Bot Detection
Shangbin Feng
Zhaoxuan Tan
Herun Wan
Ningnan Wang
Zilong Chen
...
Yanbo Wang
Lijing Zheng
Zihan Ma
Jundong Li
Minnan Luo
123
94
0
09 Jun 2022
Few-shot Question Generation for Personalized Feedback in Intelligent Tutoring Systems
Devang Kulshreshtha
Muhammad Shayan
Robert Belfer
Siva Reddy
Iulian Serban
E. Kochmar
51
11
0
08 Jun 2022
Abstraction not Memory: BERT and the English Article System
Harish Tayyar Madabushi
Dagmar Divjak
P. Milin
26
5
0
08 Jun 2022
STable: Table Generation Framework for Encoder-Decoder Models
Michal Pietruszka
M. Turski
Łukasz Borchmann
Tomasz Dwojak
Gabriela Pałka
Karolina Szyndler
Dawid Jurkiewicz
Lukasz Garncarek
LMTD
87
18
0
08 Jun 2022
Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli
158
82
0
08 Jun 2022
Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
Momin Abbas
Quan-Wu Xiao
Lisha Chen
Pin-Yu Chen
Tianyi Chen
111
84
0
08 Jun 2022
Counseling Summarization using Mental Health Knowledge Guided Utterance Filtering
Aseem Srivastava
Tharun Suresh
Sarah Peregrine
S. P. Lord
Md. Shad Akhtar
Tanmoy Chakraborty
69
14
0
08 Jun 2022
Multi-channel neural networks for predicting influenza A virus hosts and antigenic types
Yanhua Xu
D. Wojtczak
BDL
22
0
0
08 Jun 2022
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval
G. Rosa
L. Bonifacio
Vitor Jeronymo
Hugo Queiroz Abonizio
Marzieh Fadaee
R. Lotufo
Rodrigo Nogueira
101
26
0
06 Jun 2022
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa
C. Riquelme
J. Puigcerver
Rodolphe Jenatton
N. Houlsby
VLM, MoE
170
205
0
06 Jun 2022
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
Tom Ron
M. Weiler-Sagie
Tamir Hazan
FAtt, MedIm
81
6
0
06 Jun 2022
Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation
Pei Ke
Haozhe Ji
Zhenyu Yang
Yi Huang
Junlan Feng
Xiaoyan Zhu
Minlie Huang
61
6
0
06 Jun 2022
Learning to Ask Like a Physician
Eric P. Lehman
Vladislav Lialin
K. Y. Legaspi
Anne Janelle R. Sy
Patricia Therese S. Pile
...
Anna Rumshisky
Jenifer Liang
Preethi Raghavan
Leo Anthony Celi
Peter Szolovits
OOD
80
20
0
06 Jun 2022
Pretrained Models for Multilingual Federated Learning
Orion Weller
Marc Marone
Vladimir Braverman
Dawn J Lawrie
Benjamin Van Durme
VLM, FedML, AI4CE
94
42
0
06 Jun 2022
Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models
Daniil Moskovskiy
Daryna Dementieva
Alexander Panchenko
63
3
0
05 Jun 2022
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
Xudong Lin
Simran Tiwari
Shiyuan Huang
Manling Li
Mike Zheng Shou
Heng Ji
Shih-Fu Chang
127
21
0
05 Jun 2022
Formal Specifications from Natural Language
Christopher Hahn
Frederik Schmitt
Julia J. Tillman
Niklas Metzger
Julian Siber
Bernd Finkbeiner
102
29
0
04 Jun 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM, MQ
174
484
0
04 Jun 2022
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
74
556
0
03 Jun 2022
Acquiring and Modelling Abstract Commonsense Knowledge via Conceptualization
Mutian He
Tianqing Fang
Weiqi Wang
Yangqiu Song
94
30
0
03 Jun 2022
MMTM: Multi-Tasking Multi-Decoder Transformer for Math Word Problems
Keyur Faldu
Amit P. Sheth
Prashant Kikani
Darshan Patel
AIMat
52
1
0
02 Jun 2022
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Yuanze Lin
Yujia Xie
Dongdong Chen
Yichong Xu
Chenguang Zhu
Lu Yuan
88
75
0
02 Jun 2022
Learning code summarization from a small and local dataset
Toufique Ahmed
Prem Devanbu
79
10
0
02 Jun 2022
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training
Yan Zeng
Wangchunshu Zhou
Ao Luo
Ziming Cheng
Xinsong Zhang
VLM
95
32
0
01 Jun 2022
Natural Language Sentence Generation from API Specifications
Siyu Huo
K. Mukherjee
Jayachandu Bandlamudi
Vatche Isahagian
Vinod Muthusamy
Sadhana Kumaravel
59
2
0
01 Jun 2022
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption
Tianyu Chen
Hangbo Bao
Shaohan Huang
Li Dong
Binxing Jiao
Daxin Jiang
Haoyi Zhou
Jianxin Li
Furu Wei
98
107
0
01 Jun 2022
On the Usefulness of Embeddings, Clusters and Strings for Text Generator Evaluation
Tiago Pimentel
Clara Meister
Ryan Cotterell
128
7
0
31 May 2022
NEWTS: A Corpus for News Topic-Focused Summarization
Seyed Ali Bahrainian
Sheridan Feucht
Carsten Eickhoff
119
26
0
31 May 2022
Prompt Injection: Parameterization of Fixed Inputs
Eunbi Choi
Yongrae Jo
Joel Jang
Minjoon Seo
119
30
0
31 May 2022
Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking
Young-Ho Kim
Sungdong Kim
Minsuk Chang
Sang-Woo Lee
96
5
0
31 May 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe, VLM
97
13
0
30 May 2022
Billions of Parameters Are Worth More Than In-domain Training Data: A case study in the Legal Case Entailment Task
G. Rosa
L. Bonifacio
Vitor Jeronymo
Hugo Queiroz Abonizio
R. Lotufo
Rodrigo Nogueira
AILaw, ELM
100
11
0
30 May 2022
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
114
27
0
30 May 2022
Learning Locality and Isotropy in Dialogue Modeling
Han Wu
Hao Hao Tan
Mingjie Zhan
Gangming Zhao
Shaoqing Lu
Ding Liang
Linqi Song
81
2
0
29 May 2022
Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
R. Liu
Young Jin Kim
Alexandre Muzio
Hany Awadalla
MoE
81
22
0
28 May 2022
Multimodal Fake News Detection via CLIP-Guided Learning
Yangming Zhou
Qichao Ying
Zhenxing Qian
Sheng Li
Xinpeng Zhang
96
60
0
28 May 2022
Few-shot Subgoal Planning with Language Models
Lajanugen Logeswaran
Yao Fu
Moontae Lee
Honglak Lee
LRM
76
26
0
28 May 2022
Controllable Text Generation with Neurally-Decomposed Oracle
Tao Meng
Sidi Lu
Nanyun Peng
Kai-Wei Chang
BDL
103
37
0
27 May 2022
TURJUMAN: A Public Toolkit for Neural Arabic Machine Translation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
40
14
0
27 May 2022
CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
Eldar David Abraham
Karel D'Oosterlinck
Amir Feder
Y. Gat
Atticus Geiger
Christopher Potts
Roi Reichart
Zhengxuan Wu
CML
122
47
0
27 May 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
172
562
0
27 May 2022
AANG: Automating Auxiliary Learning
Lucio Dery
Paul Michel
M. Khodak
Graham Neubig
Ameet Talwalkar
114
9
0
27 May 2022
StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes
Awantee V. Deshpande
Dana Ruiter
Marius Mosbach
Dietrich Klakow
42
12
0
27 May 2022
Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts
Shrey Gupta
Anmol Agarwal
Manas Gaur
Kaushik Roy
Vignesh Narayanan
Ponnurangam Kumaraguru
Amit P. Sheth
AI4MH
67
34
0
27 May 2022
Clinical Dialogue Transcription Error Correction using Seq2Seq Models
Gayani Nanayakkara
Nirmalie Wiratunga
D. Corsar
Kyle Martin
A. Wijekoon
39
1
0
26 May 2022
Your Transformer May Not be as Powerful as You Expect
Shengjie Luo
Shanda Li
Shuxin Zheng
Tie-Yan Liu
Liwei Wang
Di He
139
54
0
26 May 2022