1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,923 papers shown
Title
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
Aran Komatsuzaki
J. Puigcerver
James Lee-Thorp
Carlos Riquelme Ruiz
Basil Mustafa
Joshua Ainslie
Yi Tay
Mostafa Dehghani
N. Houlsby
MoMe
MoE
108
124
0
09 Dec 2022
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng
Xuehai He
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Xinze Wang
William Yang Wang
CoGe
200
318
0
09 Dec 2022
TRBLLmaker -- Transformer Reads Between Lyrics Lines maker
Mor Ventura
Michael Toker
22
2
0
09 Dec 2022
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Yuxin Wang
Jieru Lin
Zhiwei Yu
Wei Hu
Börje F. Karlsson
138
20
0
09 Dec 2022
Mitigation of Spatial Nonstationarity with Vision Transformers
Lei Liu
Javier E. Santos
Maša Prodanović
Michael J. Pyrcz
50
4
0
09 Dec 2022
Learning Video Representations from Large Language Models
Yue Zhao
Ishan Misra
Philipp Krahenbuhl
Rohit Girdhar
VLM
AI4TS
118
178
0
08 Dec 2022
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Jinze Bai
Rui Men
Han Yang
Xuancheng Ren
Kai Dang
...
Wenhang Ge
Jianxin Ma
Junyang Lin
Jingren Zhou
Chang Zhou
88
16
0
08 Dec 2022
Momentum Calibration for Text Generation
Xingxing Zhang
Yiran Liu
Xun Wang
Pengcheng He
Yang Yu
Si-Qing Chen
Wayne Xiong
Furu Wei
144
9
0
08 Dec 2022
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations
Björn Plüster
Jakob Ambsdorf
Lukas Braach
Jae Hee Lee
S. Wermter
78
6
0
08 Dec 2022
Successive Prompting for Decomposing Complex Questions
Dheeru Dua
Shivanshu Gupta
Sameer Singh
Matt Gardner
ReLM
LRM
111
118
0
08 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
219
523
0
08 Dec 2022
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
Chan Hee Song
Jiaman Wu
Clay Washington
Brian M Sadler
Wei-Lun Chao
Yu-Chuan Su
LLMAG
LM&Ro
182
425
0
08 Dec 2022
Investigating Glyph Phonetic Information for Chinese Spell Checking: What Works and What's Next
Xiaotian Zhang
Yanjun Zheng
Hang Yan
Xipeng Qiu
78
5
0
08 Dec 2022
Multimodal Vision Transformers with Forced Attention for Behavior Analysis
Tanay Agrawal
Michal Balazia
Philippe Muller
François Brémond
ViT
86
9
0
07 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
166
386
0
07 Dec 2022
Robustness of Learning from Task Instructions
Jiasheng Gu
Hongyu Zhao
Hanzi Xu
Liang Nie
Hongyuan Mei
Wenpeng Yin
OOD
101
34
0
07 Dec 2022
Pre-Training With Scientific Text Improves Educational Question Generation
Hamze Muse
Sahan Bulathwela
Emine Yilmaz
AI4Ed
55
8
0
07 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
84
4
0
07 Dec 2022
Memorization of Named Entities in Fine-tuned BERT Models
Andor Diera
N. Lell
Aygul Garifullina
A. Scherp
68
0
0
07 Dec 2022
Harnessing Knowledge and Reasoning for Human-Like Natural Language Generation: A Brief Review
Jiangjie Chen
Yanghua Xiao
118
5
0
07 Dec 2022
Hierarchical multimodal transformers for Multi-Page DocVQA
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
94
61
0
07 Dec 2022
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
266
624
0
07 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
230
3,780
0
06 Dec 2022
Document-Level Abstractive Summarization
Gonçalo Raposo
Afonso Raposo
Ana Sofia Carmo
48
2
0
06 Dec 2022
Controlled Text Generation using T5 based Encoder-Decoder Soft Prompt Tuning and Analysis of the Utility of Generated Text in AI
Damith Chamalke Senadeera
Julia Ive
85
6
0
06 Dec 2022
Towards human-compatible autonomous car: A study of non-verbal Turing test in automated driving with affective transition modelling
Zhaoning Li
Qiaoli Jiang
Zhengming Wu
Anqi Liu
Haiyan Wu
Miner Huang
Kai Huang
Y. Ku
64
2
0
06 Dec 2022
DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context Tuning
Praveen Venkateswaran
Evelyn Duesterwald
Vatche Isahagian
92
9
0
06 Dec 2022
Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective
Yunrui Zhao
Qianqian Xu
Yangbangyan Jiang
Peisong Wen
Qingming Huang
71
40
0
06 Dec 2022
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
Jiaqi Chen
Tong Li
Jinghui Qin
Pan Lu
Liang Lin
Chongyu Chen
Xiaodan Liang
AIMat
LRM
112
105
0
06 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
81
2
0
06 Dec 2022
POQue: Asking Participant-specific Outcome Questions for a Deeper Understanding of Complex Events
Sai Vallurupalli
Sayontan Ghosh
K. Erk
Niranjan Balasubramanian
Francis Ferraro
61
5
0
05 Dec 2022
Unifying Vision, Text, and Layout for Universal Document Processing
Zineng Tang
Ziyi Yang
Guoxin Wang
Yuwei Fang
Yang Liu
Chenguang Zhu
Michael Zeng
Chao-Yue Zhang
Joey Tianyi Zhou
VLM
131
115
0
05 Dec 2022
Images Speak in Images: A Generalist Painter for In-Context Visual Learning
Xinlong Wang
Wen Wang
Yue Cao
Chunhua Shen
Tiejun Huang
VLM
MLLM
166
262
0
05 Dec 2022
Legal Prompt Engineering for Multilingual Legal Judgement Prediction
Dietrich Trautmann
Alina Petrova
Frank Schilder
ELM
AILaw
102
80
0
05 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Yu Pan
Jingjing Yin
Heng Lu
103
3
0
05 Dec 2022
Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer
Zhengbao Jiang
Luyu Gao
Jun Araki
Haibo Ding
Zhiruo Wang
Jamie Callan
Graham Neubig
RALM
136
43
0
05 Dec 2022
QBERT: Generalist Model for Processing Questions
Zhaozhen Xu
N. Cristianini
32
1
0
05 Dec 2022
Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation
Faeze Brahman
Baolin Peng
Michel Galley
Sudha Rao
Bill Dolan
Snigdha Chaturvedi
Jianfeng Gao
HILM
69
5
0
04 Dec 2022
Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE
Qihuang Zhong
Liang Ding
Yibing Zhan
Yu Qiao
Yonggang Wen
...
Yixin Chen
Xinbo Gao
Steven C. H. Hoi
Xiaoou Tang
Dacheng Tao
VLM
ELM
124
35
0
04 Dec 2022
MiLMo: Minority Multilingual Pre-trained Language Model
Sisi Liu
Hanru Shi
Xinhe Yu
Wugedele Bao
Yuan Sun
Xiaobing Zhao
81
0
0
04 Dec 2022
Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer
Benjamin Muller
Deepanshu Gupta
Siddharth Patwardhan
J. Fauconnier
David Vandyke
Sachin Agarwal
94
5
0
04 Dec 2022
KPT: Keyword-guided Pre-training for Grounded Dialog Generation
Qi Zhu
Fei Mi
Zheng Zhang
Yasheng Wang
Yitong Li
Xin Jiang
Qun Liu
Xiaoyan Zhu
Minlie Huang
111
5
0
04 Dec 2022
Language Models as Agent Models
Jacob Andreas
LLMAG
90
141
0
03 Dec 2022
A Survey on Medical Document Summarization
Raghav Jain
Anubhav Jangra
S. Saha
Adam Jatowt
3DGS
MedIm
82
19
0
03 Dec 2022
T-STAR: Truthful Style Transfer using AMR Graph as Intermediate Representation
Anubhav Jangra
Preksha Nema
A. Raghuveer
49
7
0
03 Dec 2022
Global memory transformer for processing long documents
Arij Al Adel
50
5
0
03 Dec 2022
CoP: Factual Inconsistency Detection by Controlling the Preference
Shuaijie She
Xiang Geng
Shujian Huang
Jiajun Chen
95
5
0
03 Dec 2022
RHO (ρ): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding
Ziwei Ji
Zihan Liu
Nayeon Lee
Tiezheng Yu
Bryan Wilie
Mini Zeng
Pascale Fung
HILM
96
55
0
03 Dec 2022
NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization
Chao Zhao
Faeze Brahman
Kaiqiang Song
Wenlin Yao
Dian Yu
Snigdha Chaturvedi
HILM
95
8
0
02 Dec 2022
Compound Tokens: Channel Fusion for Vision-Language Representation Learning
Maxwell Mbabilla Aladago
A. Piergiovanni
66
2
0
02 Dec 2022