Muppet: Massive Multi-task Representations with Pre-Finetuning

26 January 2021
Armen Aghajanyan
Anchit Gupta
Akshat Shrivastava
Xilun Chen
Luke Zettlemoyer
Sonal Gupta

Papers citing "Muppet: Massive Multi-task Representations with Pre-Finetuning"

50 / 171 papers shown
Subasa -- Adapting Language Models for Low-resourced Offensive Language Detection in Sinhala
Shanilka Haturusinghe
Tharindu Cyril Weerasooriya
Marcos Zampieri
Christopher Homan
S. Liyanage
83
0
0
02 Apr 2025
Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin
Rishab Balasubramanian
Fengyuan Liu
Nikhil Kandpal
Tu Vu
204
2
0
25 Mar 2025
On the Role of Pre-trained Embeddings in Binary Code Analysis
Alwin Maier
Felix Weissberg
Konrad Rieck
140
0
0
12 Feb 2025
IIMedGPT: Promoting Large Language Model Capabilities of Medical Tasks by Efficient Human Preference Alignment
Yiming Zhang
Zheng Chang
Wentao Cai
MengXing Ren
Kang Yuan
Yining Sun
Zenghui Ding
LM&MA
113
3
0
06 Jan 2025
Deploying Multi-task Online Server with Large Language Model
Yincen Qu
Chao Ma
Xiangying Dai
Hui Zhou
Yiting Wu
Hengyue Liu
81
0
0
06 Nov 2024
Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models
Zheng Zhao
Yftah Ziser
Shay B. Cohen
73
2
0
25 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
201
7
0
24 Oct 2024
Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
Clara Na
Ian H. Magnusson
A. Jha
Tom Sherborne
Emma Strubell
Jesse Dodge
Pradeep Dasigi
MoMe
83
5
0
21 Oct 2024
Balancing Label Quantity and Quality for Scalable Elicitation
Alex Troy Mallen
Nora Belrose
87
2
0
17 Oct 2024
Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information
Yingya Li
Timothy A. Miller
Steven Bethard
G. Savova
82
2
0
16 Oct 2024
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models
Zi Gong
Hang Yu
Cong Liao
Bingchang Liu
Chaoyu Chen
Jianguo Li
MoMe
63
5
0
09 Oct 2024
Mixing It Up: The Cocktail Effect of Multi-Task Fine-Tuning on LLM Performance -- A Case Study in Finance
Meni Brief
Oded Ovadia
Gil Shenderovitz
Noga Ben Yoash
Rachel Lemberg
Eitam Sheetrit
89
4
0
01 Oct 2024
The Perfect Blend: Redefining RLHF with Mixture of Judges
Tengyu Xu
Eryk Helenowski
Karthik Abinav Sankararaman
Di Jin
Kaiyan Peng
...
Gabriel Cohen
Yuandong Tian
Hao Ma
Sinong Wang
Han Fang
147
14
0
30 Sep 2024
Pre-Finetuning with Impact Duration Awareness for Stock Movement Prediction
Chr-Jr Chiu
Chung-Chi Chen
Hen-Hsen Huang
Hsin-Hsi Chen
AIFin
49
0
0
25 Sep 2024
Reducing the Cost: Cross-Prompt Pre-Finetuning for Short Answer Scoring
Hiroaki Funayama
Yuya Asazuma
Yuichiroh Matsubayashi
Tomoya Mizumoto
Kentaro Inui
38
7
0
26 Aug 2024
Exploring the Latest LLMs for Leaderboard Extraction
Salomon Kabongo
Jennifer D'Souza
Sören Auer
72
2
0
06 Jun 2024
Pre-Calc: Learning to Use the Calculator Improves Numeracy in Language Models
Vishruth Veerendranath
Vishwa Shah
Kshitish Ghate
108
0
0
22 Apr 2024
Multi-Task Learning for Features Extraction in Financial Annual Reports
Syrielle Montariol
Matej Martinc
Andraz Pelicon
Senja Pollak
Boshko Koloski
Igor Loncarski
Aljoša Valentinčič
37
3
0
08 Apr 2024
Bridging Remote Sensors with Multisensor Geospatial Foundation Models
Boran Han
Shuai Zhang
Xingjian Shi
Markus Reichstein
92
27
0
01 Apr 2024
Qibo: A Large Language Model for Traditional Chinese Medicine
Heyi Zhang
Xin Wang
Zhaopeng Meng
Zhe Chen
Pengwei Zhuang
Yongzhe Jia
Dawei Xu
Wenbin Guo
LM&MA
118
13
0
24 Mar 2024
MAGPIE: Multi-Task Media-Bias Analysis Generalization for Pre-Trained Identification of Expressions
Tomáš Horych
Martin Wessel
Jan Philip Wahle
Terry Ruas
Jerome Wassmuth
André Greiner-Petter
Akiko Aizawa
Bela Gipp
Timo Spinde
77
2
0
27 Feb 2024
PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning
Zhisheng Lin
Han Fu
Chenghao Liu
Zhuo Li
Jianling Sun
MoE, MoMe
87
6
0
23 Feb 2024
MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning
Shu Yang
Muhammad Asif Ali
Cheng-Long Wang
Lijie Hu
Di Wang
CLL, MoE
116
46
0
17 Feb 2024
Large Language Models for Scientific Information Extraction: An Empirical Study for Virology
Mahsa Shamsabadi
Jennifer D'Souza
Sören Auer
92
8
0
18 Jan 2024
One-Shot Learning as Instruction Data Prospector for Large Language Models
Yunshui Li
Binyuan Hui
Xiaobo Xia
Jiaxi Yang
Min Yang
...
Ling-Hao Chen
Junhao Liu
Tongliang Liu
Fei Huang
Yongbin Li
133
36
0
16 Dec 2023
Multitask Learning Can Improve Worst-Group Outcomes
Atharva Kulkarni
Lucio Dery
Amrith Rajagopal Setlur
Aditi Raghunathan
Ameet Talwalkar
Graham Neubig
96
2
0
05 Dec 2023
DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
Chenyu Jiang
Zhen Jia
Shuai Zheng
Yida Wang
Chuan Wu
MoE, AI4CE
39
8
0
17 Nov 2023
Text generation for dataset augmentation in security classification tasks
Alexander P. Welsh
Matthew Edwards
53
1
0
22 Oct 2023
Controlled Randomness Improves the Performance of Transformer Models
Tobias Deußer
Cong Zhao
Wolfgang Krämer
David Leonhard
Christian Bauckhage
R. Sifa
66
1
0
20 Oct 2023
Enhancing Low-resource Fine-grained Named Entity Recognition by Leveraging Coarse-grained Datasets
Su ah Lee
Seokjin Oh
Woohwan Jung
93
3
0
18 Oct 2023
Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning
Hao Zhao
Jie Fu
Zhaofeng He
169
6
0
18 Oct 2023
Lightweight In-Context Tuning for Multimodal Unified Models
Yixin Chen
Shuai Zhang
Boran Han
Jiaya Jia
70
2
0
08 Oct 2023
PromptST: Prompt-Enhanced Spatio-Temporal Multi-Attribute Prediction
Zijian Zhang
Xiangyu Zhao
Qidong Liu
Chunxu Zhang
Qian Ma
Wanyu Wang
Hongwei Zhao
Yiqi Wang
Zitao Liu
AI4TS
148
21
0
18 Sep 2023
Software Entity Recognition with Noise-Robust Learning
Nguyen Tai
Yifeng Di
J. Lee
Muhao Chen
Tianyi Zhang
76
4
0
21 Aug 2023
Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey
Lovre Torbarina
Tin Ferkovic
Lukasz Roguski
Velimir Mihelčić
Bruno Šarlija
Z. Kraljevic
85
5
0
16 Aug 2023
Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue
Songhua Yang
Hanjia Zhao
Senbin Zhu
Guangyu Zhou
Hongfei Xu
Yuxiang Jia
Hongying Zan
AI4MH, LM&MA
134
137
0
07 Aug 2023
A Knowledge-enhanced Two-stage Generative Framework for Medical Dialogue Information Extraction
Zefa Hu
Ziyi Ni
Jing Shi
Shuang Xu
Bo Xu
MedIm
98
2
0
30 Jul 2023
Dialogue Agents 101: A Beginner's Guide to Critical Ingredients for Designing Effective Conversational Systems
Shivani Kumar
S. Bhatia
Milan Aggarwal
Tanmoy Chakraborty
101
1
0
14 Jul 2023
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Liam Collins
Hamed Hassani
Mahdi Soltanolkotabi
Aryan Mokhtari
Sanjay Shakkottai
133
11
0
13 Jul 2023
Text Alignment Is An Efficient Unified Model for Massive NLP Tasks
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
ALM
90
11
0
06 Jul 2023
Meta-training with Demonstration Retrieval for Efficient Few-shot Learning
Aaron Mueller
Kanika Narang
Lambert Mathias
Qifan Wang
Hamed Firooz
RALM
94
3
0
30 Jun 2023
JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving
Wayne Xin Zhao
Kun Zhou
Beichen Zhang
Zheng Gong
Zhipeng Chen
...
Ji-Rong Wen
Jing Sha
Shijin Wang
Cong Liu
Guoping Hu
MoE, LRM
121
5
0
19 Jun 2023
Explore, Establish, Exploit: Red Teaming Language Models from Scratch
Stephen Casper
Jason Lin
Joe Kwon
Gatlen Culp
Dylan Hadfield-Menell
AAML
74
99
0
15 Jun 2023
Generate to Understand for Representation
Changshan Xue
Xiande Zhong
Xiaoqing Liu
VLM
104
0
0
14 Jun 2023
Hexatagging: Projective Dependency Parsing as Tagging
Afra Amini
Tianyu Liu
Ryan Cotterell
VLM, 3DV
54
3
0
08 Jun 2023
ModuleFormer: Modularity Emerges from Mixture-of-Experts
Songlin Yang
Zheyu Zhang
Tianyou Cao
Shawn Tan
Zhenfang Chen
Chuang Gan
KELM, MoE
82
10
0
07 Jun 2023
ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval
Jiapeng Wang
Chengyu Wang
Xiaodan Wang
Jun Huang
Lianwen Jin
VLM
121
5
0
28 May 2023
Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark
Minje Choi
Jiaxin Pei
Sagar Kumar
Chang Shu
David Jurgens
ALM, LLMAG
148
72
0
24 May 2023
Few-shot Unified Question Answering: Tuning Models or Prompts?
Srijan Bansal
Semih Yavuz
Bo Pang
Meghana Moorthy Bhat
Yingbo Zhou
108
2
0
23 May 2023
When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP
Jingwei Ni
Zhijing Jin
Qian Wang
Mrinmaya Sachan
Markus Leippold
AIFin
79
6
0
23 May 2023