Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,959 papers shown
Title
ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings
William Brannon
Wonjune Kang
S. Fulay
Hang Jiang
Brandon Roy
Dwaipayan Roy
Jad Kabbara
SSL
83
23
0
23 May 2023
Debiasing should be Good and Bad: Measuring the Consistency of Debiasing Techniques in Language Models
Robert D Morabito
Jad Kabbara
Ali Emami
54
7
0
23 May 2023
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
87
2
0
23 May 2023
mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations
Jonas Pfeiffer
Francesco Piccinno
Massimo Nicosia
Xinyi Wang
Machel Reid
Sebastian Ruder
VLM, LRM
106
31
0
23 May 2023
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
Benjamin Minixhofer
Jonas Pfeiffer
Ivan Vulić
81
7
0
23 May 2023
Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker
Sukmin Cho
Soyeong Jeong
Jeongyeon Seo
Jong C. Park
OffRL
125
25
0
23 May 2023
Continual Dialogue State Tracking via Example-Guided Question Answering
Hyundong Justin Cho
Andrea Madotto
Zhaojiang Lin
Khyathi Chandu
Satwik Kottur
Jing Xu
Jonathan May
Chinnadhurai Sankar
CLL
83
3
0
23 May 2023
Using Textual Interface to Align External Knowledge for End-to-End Task-Oriented Dialogue Systems
Qingyang Wu
Deema Alnuhait
Derek Chen
Zhou Yu
118
5
0
23 May 2023
Exploring Large Language Models for Classical Philology
Frederick Riemenschneider
Anette Frank
70
16
0
23 May 2023
IdEALS: Idiomatic Expressions for Advancement of Language Skills
Narutatsu Ri
Bill Sun
Sam Davidson
Zhou Yu
71
0
0
23 May 2023
SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres
Shumin Deng
Shengyu Mao
Ningyu Zhang
Bryan Hooi
66
5
0
23 May 2023
Understanding Programs by Exploiting (Fuzzing) Test Cases
Jianyu Zhao
Yuyang Rong
Yiwen Guo
Yifeng He
Hao Chen
114
17
0
23 May 2023
APPLS: Evaluating Evaluation Metrics for Plain Language Summarization
Yue Guo
Tal August
Gondy Leroy
T. Cohen
Lucy Lu Wang
186
9
0
23 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs
Giorgos Vernikos
Arthur Bražinskas
Jakub Adamek
Jonathan Mallinson
Aliaksei Severyn
Eric Malmi
BDL, LRM
96
16
0
22 May 2023
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method
Yiming Wang
Zhuosheng Zhang
Rui Wang
117
88
0
22 May 2023
BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance
Karel D'Oosterlinck
François Remy
Johannes Deleu
Thomas Demeester
Chris Develder
Klim Zaporojets
Aneiss Ghodsi
Simon Ellershaw
Jack R. Collins
Christopher Potts
108
11
0
22 May 2023
DiffusionNER: Boundary Diffusion for Named Entity Recognition
Yongliang Shen
Kaitao Song
Xuejiao Tan
Dongsheng Li
Weiming Lu
Yueting Zhuang
DiffM
114
38
0
22 May 2023
Investigating the Role of Feed-Forward Networks in Transformers Using Parallel Attention and Feed-Forward Net Design
Shashank Sonkar
Richard G. Baraniuk
59
4
0
22 May 2023
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
Fuzhao Xue
Yao Fu
Wangchunshu Zhou
Zangwei Zheng
Yang You
155
86
0
22 May 2023
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance
Yue Zhang
Leyang Cui
Deng Cai
Xinting Huang
Tao Fang
Wei Bi
ALM
98
36
0
22 May 2023
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark
Shruti Rijhwani
Sebastian Gehrmann
Joshua Maynez
Roee Aharoni
Vitaly Nikolaev
Thibault Sellam
Aditya Siddhant
Dipanjan Das
Ankur P. Parikh
97
41
0
22 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao
Peng Wang
Bo Tian
Shuyang Cheng
Zhoubo Li
Shumin Deng
Huajun Chen
Ningyu Zhang
KELM
126
314
0
22 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
125
168
0
22 May 2023
Machine-Created Universal Language for Cross-lingual Transfer
Yaobo Liang
Quanzhi Zhu
Junhe Zhao
Nan Duan
84
7
0
22 May 2023
Friendly Neighbors: Contextualized Sequence-to-Sequence Link Prediction
Adrian Kochsiek
Apoorv Saxena
Inderjeet Nair
Rainer Gemulla
94
10
0
22 May 2023
MaNtLE: Model-agnostic Natural Language Explainer
Rakesh R Menon
Kerem Zaman
Shashank Srivastava
FAtt, LRM
85
2
0
22 May 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
Ariel Ekgren
Amaru Cuba Gyllensten
Felix Stollenwerk
Joey Öhman
T. Isbister
Evangelia Gogoulou
F. Carlsson
Alice Heiman
Judit Casademont
Magnus Sahlgren
88
13
0
22 May 2023
Lion: Adversarial Distillation of Proprietary Large Language Models
Yuxin Jiang
Chunkit Chan
Yin Hua
Wei Wang
ALM
110
25
0
22 May 2023
Kanbun-LM: Reading and Translating Classical Chinese in Japanese Methods by Language Models
Hao Wang
Hirofumi Shimizu
Daisuke Kawahara
77
1
0
22 May 2023
Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting
Xinlu Zhang
Shiyang Li
Xianjun Yang
Chenxin Tian
Yao Qin
Linda R. Petzold
133
9
0
22 May 2023
Model Analysis & Evaluation for Ambiguous Question Answering
Konstantinos Papakostas
Irene Papadopoulou
ELM
30
1
0
21 May 2023
Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification
Renliang Sun
Wei Xu
Xiaojun Wan
CLL
98
19
0
21 May 2023
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Ziyi Yang
Mahmoud Khademi
Yichong Xu
Reid Pryzant
Yuwei Fang
...
Yu Shi
Lu Yuan
Takuya Yoshioka
Michael Zeng
Xuedong Huang
70
2
0
21 May 2023
Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond
Haw-Shiuan Chang
Zonghai Yao
Alolika Gon
Hong-ye Yu
Andrew McCallum
105
11
0
20 May 2023
Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization
Lei Lin
Shuangtao Li
Yafang Zheng
Biao Fu
Shantao Liu
Yidong Chen
Xiaodong Shi
CoGe
94
3
0
20 May 2023
Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning
Mustafa Safa Ozdayi
Charith Peris
Jack G. M. FitzGerald
Christophe Dupuy
Jimit Majmudar
Haidar Khan
Rahil Parikh
Rahul Gupta
70
34
0
19 May 2023
DiffuSIA: A Spiral Interaction Architecture for Encoder-Decoder Text Diffusion
Chao-Hong Tan
Jia-Chen Gu
Zhen-Hua Ling
DiffM
69
1
0
19 May 2023
Self-Agreement: A Framework for Fine-tuning Language Models to Find Agreement among Diverse Opinions
Shiyao Ding
Takayuki Ito
SyDa
35
7
0
19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
134
96
0
19 May 2023
Differentially Private Adapters for Parameter Efficient Acoustic Modeling
Chun-Wei Ho
Chao-Han Huck Yang
Sabato Marco Siniscalchi
108
1
0
19 May 2023
SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models
Ziyi Wu
Jingyu Hu
Wuyue Lu
Igor Gilitschenski
Animesh Garg
DiffM, OCL
134
47
0
18 May 2023
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
Can Qin
Shu Zhen Zhang
Ning Yu
Yihao Feng
Xinyi Yang
...
Caiming Xiong
Silvio Savarese
Stefano Ermon
Yun Fu
Ran Xu
119
136
0
18 May 2023
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences
David C. Uthus
Santiago Ontañón
Joshua Ainslie
Mandy Guo
VLM
55
12
0
18 May 2023
Learning In-context Learning for Named Entity Recognition
Jiawei Chen
Yaojie Lu
Hongyu Lin
Jie Lou
Wei Jia
Dai Dai
Hua Wu
Boxi Cao
Xianpei Han
Le Sun
NAI
124
22
0
18 May 2023
ProgSG: Cross-Modality Representation Learning for Programs in Electronic Design Automation
Yunsheng Bai
Atefeh Sohrabizadeh
Zongyue Qin
Ziniu Hu
Yizhou Sun
Jason Cong
117
1
0
18 May 2023
Ahead-of-Time P-Tuning
Daniil Gavrilov
Nikita Balagansky
56
1
0
18 May 2023
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Zhaozhuo Xu
Zirui Liu
Beidi Chen
Yuxin Tang
Jue Wang
Kaixiong Zhou
Helen Zhou
Anshumali Shrivastava
MQ
102
32
0
17 May 2023
What You See is What You Read? Improving Text-Image Alignment Evaluation
Michal Yarom
Yonatan Bitton
Soravit Changpinyo
Roee Aharoni
Jonathan Herzig
Oran Lang
E. Ofek
Idan Szpektor
EGVM
169
85
0
17 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Stephan Clémençon
Pierre Colombo
202
9
0
17 May 2023
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Alon Jacovi
Avi Caciularu
Omer Goldman
Yoav Goldberg
90
107
0
17 May 2023