2101.11038
Cited By
Muppet: Massive Multi-task Representations with Pre-Finetuning
26 January 2021
Armen Aghajanyan
Anchit Gupta
Akshat Shrivastava
Xilun Chen
Luke Zettlemoyer
Sonal Gupta
Papers citing
"Muppet: Massive Multi-task Representations with Pre-Finetuning"
50 / 171 papers shown
TransCoder: Towards Unified Transferable Code Representation Learning Inspired by Human Skills
Qiushi Sun
Nuo Chen
Jiadong Wang
Xiang Li
Ming Gao
90
8
0
23 May 2023
Continual Dialogue State Tracking via Example-Guided Question Answering
Hyundong Justin Cho
Andrea Madotto
Zhaojiang Lin
Khyathi Chandu
Satwik Kottur
Jing Xu
Jonathan May
Chinnadhurai Sankar
CLL
88
3
0
23 May 2023
TaskWeb: Selecting Better Source Tasks for Multi-task NLP
Joongwon Kim
Akari Asai
Gabriel Ilharco
Hannaneh Hajishirzi
102
12
0
22 May 2023
Soft Prompt Decoding for Multilingual Dense Retrieval
Zhiqi Huang
Hansi Zeng
Hamed Zamani
James Allan
RALM
104
13
0
15 May 2023
Recyclable Tuning for Continual Pre-training
Yujia Qin
Cheng Qian
Xu Han
Yankai Lin
Huadong Wang
Ruobing Xie
Zhiyuan Liu
Maosong Sun
Jie Zhou
CLL
88
13
0
15 May 2023
STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation
Yulun Du
Lydia B. Chilton
90
8
0
14 May 2023
Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts
Zhaoyang Zhang
Yantao Shen
Kunyu Shi
Zhaowei Cai
Jun Fang
Siqi Deng
Hao Yang
Davide Modolo
Zhuowen Tu
Stefano Soatto
VLM
83
2
0
11 May 2023
Long-Tailed Question Answering in an Open World
Yinpei Dai
Hao Lang
Yinhe Zheng
Fei Huang
Yongbin Li
VLM
83
9
0
11 May 2023
XTab: Cross-table Pretraining for Tabular Transformers
Bingzhao Zhu
Xingjian Shi
Nick Erickson
Mu Li
George Karypis
Mahsa Shoaran
LMTD
134
78
0
10 May 2023
Best-Effort Adaptation
Pranjal Awasthi
Corinna Cortes
M. Mohri
93
8
0
10 May 2023
VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation
Xilun Chen
L. Yu
Wenhan Xiong
Barlas Oğuz
Yashar Mehdad
Wen-tau Yih
VGen
58
3
0
04 May 2023
Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory
Xin Cheng
Di Luo
Preslav Nakov
Lemao Liu
Dongyan Zhao
Rui Yan
RALM
233
104
0
03 May 2023
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Chengyue Wu
Teng Wang
Yixiao Ge
Zeyu Lu
Rui-Zhi Zhou
Ying Shan
Ping Luo
MoMe
150
37
0
27 Apr 2023
LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
Anjana Arunkumar
Sanjay Kariyappa
Rakhi Agrawal
Sriramakrishnan Chandrasekaran
Chris Bryan
87
0
0
12 Apr 2023
SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification
Ben Wu
Olesya Razuvayevskaya
Freddy Heppell
João A. Leite
Carolina Scarton
Kalina Bontcheva
Xingyi Song
49
9
0
16 Mar 2023
AUTODIAL: Efficient Asynchronous Task-Oriented Dialogue Model
Prajjwal Bhargava
P. Amini
Shahin Shayandeh
Chinnadhurai Sankar
48
0
0
10 Mar 2023
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Zhen Wang
Yikang Shen
Leonid Karlinsky
Rogerio Feris
Huan Sun
Yoon Kim
VLM
VPVLM
108
118
0
06 Mar 2023
Pre-Finetuning for Few-Shot Emotional Speech Recognition
Maximillian Chen
Zhou Yu
84
4
0
24 Feb 2023
Conversational Text-to-SQL: An Odyssey into State-of-the-Art and Challenges Ahead
S. Parthasarathi
Lu Zeng
Dilek Z. Hakkani-Tür
88
2
0
21 Feb 2023
Privately Customizing Prefinetuning to Better Match User Data in Federated Learning
Charlie Hou
Hongyuan Zhan
Akshat Shrivastava
Sida I. Wang
S. Livshits
Giulia Fanti
Daniel Lazar
FedML
97
16
0
17 Feb 2023
Transformer models: an introduction and catalog
X. Amatriain
Ananth Sankar
Jie Bing
Praveen Kumar Bodigutla
Timothy J. Hazen
Michaeel Kazi
150
53
0
12 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
90
52
0
09 Feb 2023
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
Changan Niu
Chuanyi Li
Vincent Ng
Bin Luo
ELM
ALM
110
9
0
08 Feb 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang
Seungone Kim
Seonghyeon Ye
Doyoung Kim
Lajanugen Logeswaran
Moontae Lee
Kyungjae Lee
Minjoon Seo
LRM
ALM
139
83
0
07 Feb 2023
Curriculum-Guided Abstractive Summarization
Sajad Sotudeh
Hanieh Deilamsalehy
Franck Dernoncourt
Nazli Goharian
102
2
0
02 Feb 2023
Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data
Alon Albalak
Colin Raffel
William Yang Wang
105
12
0
01 Feb 2023
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Shayne Longpre
Le Hou
Tu Vu
Albert Webson
Hyung Won Chung
...
Denny Zhou
Quoc V. Le
Barret Zoph
Jason W. Wei
Adam Roberts
ALM
160
678
0
31 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
82
3
0
24 Jan 2023
ERNIE 3.0 Tiny: Frustratingly Simple Method to Improve Task-Agnostic Distillation Generalization
Weixin Liu
Xuyi Chen
Jiaxiang Liu
Shi Feng
Yu Sun
Hao Tian
Hua Wu
99
2
0
09 Jan 2023
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
Zhiyang Xu
Ying Shen
Lifu Huang
MLLM
154
120
0
21 Dec 2022
Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization
Artidoro Pagnoni
Alexander R. Fabbri
Wojciech Kryściński
Chien-Sheng Wu
RALM
131
18
0
20 Dec 2022
MIGA: A Unified Multi-task Generation Framework for Conversational Text-to-SQL
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
75
7
0
19 Dec 2022
From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader
Weiwen Xu
Xin Li
Wenxuan Zhang
Meng Zhou
W. Lam
Luo Si
Lidong Bing
90
2
0
09 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
109
55
0
02 Dec 2022
SciRepEval: A Multi-Format Benchmark for Scientific Document Representations
Amanpreet Singh
Mike D'Arcy
Arman Cohan
Doug Downey
Sergey Feldman
129
93
0
23 Nov 2022
TCBERT: A Technical Report for Chinese Topic Classification BERT
Ting Han
Kunhao Pan
Xinyu Chen
Dingjie Song
Yuchen Fan
Xinyu Gao
Ruyi Gan
Jiaxing Zhang
VLM
67
1
0
21 Nov 2022
Zero-Shot Classification by Logical Reasoning on Natural Language Explanations
Chi Han
Hengzhi Pei
Xinya Du
Heng Ji
NAI
94
3
0
07 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim
Byounghan Lee
Kyung-ah Sohn
78
14
0
01 Nov 2022
Where to start? Analyzing the potential value of intermediate models
Leshem Choshen
Elad Venezian
Shachar Don-Yehiya
Noam Slonim
Yoav Katz
MoMe
132
27
0
31 Oct 2022
Zero-Shot Text Classification with Self-Training
Ariel Gera
Alon Halfon
Eyal Shnarch
Yotam Perlitz
L. Ein-Dor
Noam Slonim
VLM
91
62
0
31 Oct 2022
Lila: A Unified Benchmark for Mathematical Reasoning
Swaroop Mishra
Matthew Finlayson
Pan Lu
Leonard Tang
Sean Welleck
...
Tanmay Rajpurohit
Oyvind Tafjord
Ashish Sabharwal
Peter Clark
Ashwin Kalyan
ELM
AIMat
ReLM
LRM
114
0
0
31 Oct 2022
Analyzing Multi-Task Learning for Abstractive Text Summarization
Frederic Kirstein
Jan Philip Wahle
Terry Ruas
Bela Gipp
85
4
0
26 Oct 2022
Exploring Mode Connectivity for Pre-trained Language Models
Yujia Qin
Cheng Qian
Jing Yi
Weize Chen
Yankai Lin
Xu Han
Zhiyuan Liu
Maosong Sun
Jie Zhou
99
21
0
25 Oct 2022
Visualizing the Obvious: A Concreteness-based Ensemble Model for Noun Property Prediction
Yue Yang
Artemis Panagopoulou
Marianna Apidianaki
Mark Yatskar
Chris Callison-Burch
113
2
0
24 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
433
3,180
0
20 Oct 2022
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay
Jason W. Wei
Hyung Won Chung
Vinh Q. Tran
David R. So
...
Donald Metzler
Slav Petrov
N. Houlsby
Quoc V. Le
Mostafa Dehghani
LRM
124
71
0
20 Oct 2022
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Ping Yang
Junjie Wang
Ruyi Gan
Xinyu Zhu
Lin Zhang
Ziwei Wu
Xinyu Gao
Jiaxing Zhang
Tetsuya Sakai
BDL
73
27
0
16 Oct 2022
Task Compass: Scaling Multi-task Pre-training with Task Prefix
Zhuosheng Zhang
Shuohang Wang
Yichong Xu
Yuwei Fang
Wenhao Yu
Yang Liu
Han Zhao
Chenguang Zhu
Michael Zeng
SSL
LRM
82
16
0
12 Oct 2022
Data-Efficiency with a Single GPU: An Exploration of Transfer Methods for Small Language Models
Alon Albalak
Akshat Shrivastava
Chinnadhurai Sankar
Adithya Sagar
Mike Ross
84
3
0
08 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
307
100
0
06 Oct 2022