Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson · MoE · 8 June 2021 · arXiv:2106.04489
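
For context on the cited paper itself: its central mechanism is a single hypernetwork, shared across tasks (and, in the paper, across transformer layers), that generates the weights of small task-specific adapter modules from learned task embeddings, so each additional task costs only an embedding rather than a full set of adapter weights. The following is a minimal illustrative PyTorch sketch of that idea, not the authors' released implementation; the class name SharedHypernetAdapter and all dimensions (task_dim, hidden_dim, bottleneck) are assumptions chosen for readability.

    # Illustrative sketch only; not the paper's released code.
    import torch
    import torch.nn as nn

    class SharedHypernetAdapter(nn.Module):
        """A shared hypernetwork maps a task embedding to adapter weights."""
        def __init__(self, num_tasks: int, task_dim: int = 64,
                     hidden_dim: int = 512, bottleneck: int = 32):
            super().__init__()
            self.hidden_dim, self.bottleneck = hidden_dim, bottleneck
            # One small embedding per task; the generator below is shared.
            self.task_emb = nn.Embedding(num_tasks, task_dim)
            n_params = 2 * hidden_dim * bottleneck  # down- and up-projection
            self.hypernet = nn.Sequential(
                nn.Linear(task_dim, task_dim), nn.ReLU(),
                nn.Linear(task_dim, n_params),
            )

        def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
            # Generate this task's adapter weights on the fly.
            w = self.hypernet(self.task_emb(task_id))
            d, b = self.hidden_dim, self.bottleneck
            w_down = w[: d * b].view(b, d)   # (bottleneck, hidden)
            w_up = w[d * b:].view(d, b)      # (hidden, bottleneck)
            # Bottleneck adapter with a residual connection.
            h = torch.relu(x @ w_down.t()) @ w_up.t()
            return x + h

    adapter = SharedHypernetAdapter(num_tasks=8)
    x = torch.randn(4, 16, 512)          # (batch, seq, hidden)
    out = adapter(x, torch.tensor(3))    # weights generated for task 3
    print(out.shape)                     # torch.Size([4, 16, 512])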

Papers citing "Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks"

Showing 34 of 84 citing papers.
HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization
Jingang Qu, T. Faney, Zehao Wang, Patrick Gallinari, Soleiman Yousef, J. D. Hemptinne · OOD · 15 Nov 2022

Evaluating Parameter Efficient Learning for Generation
Peng Xu, M. Patwary, Shrimai Prabhumoye, Virginia Adams, R. Prenger, Ming-Yu Liu, Nayeon Lee, M. Shoeybi, Bryan Catanzaro · MoE · 25 Oct 2022

Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks
Arijit Sehanobish, Kawshik Kannan, Nabila Abraham, Anasuya Das, Benjamin Odry · VLM · 22 Oct 2022

Boosting Natural Language Generation from Instructions with Meta-Learning
Budhaditya Deb, Guoqing Zheng, Ahmed Hassan Awadallah · 20 Oct 2022

Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning
Dongze Lian, Daquan Zhou, Jiashi Feng, Xinchao Wang · 17 Oct 2022
UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation
I. Sarhan, P. Mosteiro, Marco Spruit · 07 Oct 2022

Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks
Yen-Cheng Liu, Chih-Yao Ma, Junjiao Tian, Zijian He, Z. Kira · 07 Oct 2022

State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, ..., Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin · 06 Oct 2022

Meta-Ensemble Parameter Learning
Zhengcong Fei, Shuman Tian, Junshi Huang, Xiaoming Wei, Xiaolin K. Wei · OOD · 05 Oct 2022

Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
Erica K. Shimomoto, Edison Marrese-Taylor, Hiroya Takamura, Ichiro Kobayashi, Hideki Nakayama, Yusuke Miyao · 26 Sep 2022
PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao · VLM, CLL · 22 Aug 2022

LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
Yi-Lin Sung, Jaemin Cho, Joey Tianyi Zhou · VLM · 13 Jun 2022

ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi · 24 May 2022

When does Parameter-Efficient Transfer Learning Work for Machine Translation?
Ahmet Üstün, Asa Cooper Stickland · 23 May 2022

Lifting the Curse of Multilinguality by Pre-training Modular Transformers
Jonas Pfeiffer, Naman Goyal, Xi Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe · LRM · 12 May 2022

Polyglot Prompt: Multilingual Multitask PrompTraining
Jinlan Fu, See-Kiong Ng, Pengfei Liu · 29 Apr 2022
PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Marzieh Saeidi, Lambert Mathias, Ves Stoyanov, Majid Yazdani · VLM · 03 Apr 2022

APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction
Bencheng Yan, Pengjie Wang, Kai Zhang, Feng Li, Hongbo Deng, Jian Xu, Bo Zheng · 30 Mar 2022

Fine-tuning Image Transformers using Learnable Memory
Mark Sandler, A. Zhmoginov, Max Vladymyrov, Andrew Jackson · ViT · 29 Mar 2022

Hyperdecoders: Instance-specific decoders for multi-task NLP
Hamish Ivison, Matthew E. Peters · AI4CE · 15 Mar 2022

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models
Ning Ding, Yujia Qin, Guang Yang, Fu Wei, Zonghan Yang, ..., Jianfei Chen, Yang Liu, Jie Tang, Juan Li, Maosong Sun · 14 Mar 2022
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks
Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang · 08 Mar 2022

Combining Modular Skills in Multitask Learning
Edoardo Ponti, Alessandro Sordoni, Yoshua Bengio, Siva Reddy · MoE · 28 Feb 2022

Computing Multiple Image Reconstructions with a Single Hypernetwork
Alan Q. Wang, Adrian Dalca, M. Sabuncu · 22 Feb 2022

HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
A. Zhmoginov, Mark Sandler, Max Vladymyrov · ViT · 11 Jan 2022

VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
Yi-Lin Sung, Jaemin Cho, Joey Tianyi Zhou · VLM, VPVLM · 13 Dec 2021

Pruning Pretrained Encoders with a Multitask Objective
Patrick Xia, Richard Shin · 10 Dec 2021
Training Neural Networks with Fixed Sparse Masks
Yi-Lin Sung, Varun Nair, Colin Raffel · FedML · 18 Nov 2021

The Efficiency Misnomer
Daoyuan Chen, Liuyi Yao, Dawei Gao, Ashish Vaswani, Yaliang Li · 25 Oct 2021

Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu, Fuli Luo, Zhiyuan Zhang, Chuanqi Tan, Baobao Chang, Songfang Huang, Fei Huang · LRM · 13 Sep 2021

Efficient Test Time Adapter Ensembling for Low-resource Language Varieties
Xinyi Wang, Yulia Tsvetkov, Sebastian Ruder, Graham Neubig · 10 Sep 2021

Patient Outcome and Zero-shot Diagnosis Prediction with Hypernetwork-guided Multitask Learning
Shaoxiong Ji, Pekka Marttinen · 07 Sep 2021

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder · MoE · 08 Jun 2021

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · ELM · 20 Apr 2018