Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts

25 May 2022
Qinyuan Ye
Juan Zha
Xiang Ren
    MoE

Papers citing "Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts"

50 / 112 papers shown
Title
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin
Kangmin Tan
Chris Miller
Beiwen Tian
Xiang Ren
LRM
RALM
66
49
0
17 Apr 2022
Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners
Shashank Gupta
Subhabrata Mukherjee
K. Subudhi
Eduardo Gonzalez
Damien Jose
Ahmed Hassan Awadallah
Jianfeng Gao
MoE
67
50
0
16 Apr 2022
Combining Modular Skills in Multitask Learning
Edoardo Ponti
Alessandro Sordoni
Yoshua Bengio
Siva Reddy
MoE
58
37
0
28 Feb 2022
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach
Victor Sanh
Zheng-Xin Yong
Albert Webson
Colin Raffel
...
Khalid Almubarak
Xiangru Tang
Dragomir R. Radev
Mike Tian-Jian Jiang
Alexander M. Rush
VLM
322
348
0
02 Feb 2022
Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation
Jixuan Wang
Kuan-Chieh Wang
Frank Rudzicz
M. Brudno
VLM
50
21
0
27 Jan 2022
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
183
196
0
20 Dec 2021
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Nan Du
Yanping Huang
Andrew M. Dai
Simon Tong
Dmitry Lepikhin
...
Kun Zhang
Quoc V. Le
Yonghui Wu
Zhiwen Chen
Claire Cui
ALM
MoE
216
813
0
13 Dec 2021
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
V. Aribandi
Yi Tay
Tal Schuster
J. Rao
H. Zheng
...
Jianmo Ni
Jai Gupta
Kai Hui
Sebastian Ruder
Donald Metzler
MoE
86
215
0
22 Nov 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
342
1,702
0
15 Oct 2021
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
MoE
245
109
0
24 Sep 2021
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
216
610
0
07 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
198
3,750
0
03 Sep 2021
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
ViT
68
133
0
26 Aug 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
286
184
0
18 Apr 2021
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
Swaroop Mishra
Daniel Khashabi
Chitta Baral
Hannaneh Hajishirzi
LRM
145
747
0
18 Apr 2021
Muppet: Massive Multi-task Representations with Pre-Finetuning
Armen Aghajanyan
Anchit Gupta
Akshat Shrivastava
Xilun Chen
Luke Zettlemoyer
Sonal Gupta
70
268
0
26 Jan 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
85
2,187
0
11 Jan 2021
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
Binny Mathew
Punyajoy Saha
Seid Muhie Yimam
Chris Biemann
Pawan Goyal
Animesh Mukherjee
114
578
0
18 Dec 2020
CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims
Thomas Diggelmann
Jordan L. Boyd-Graber
Jannis Bulian
Massimiliano Ciaramita
Markus Leippold
79
203
0
01 Dec 2020
Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities
Hao Zhang
Jae Hun Ro
R. Sproat
35
13
0
05 Nov 2020
Investigating Societal Biases in a Poetry Composition System
Emily Sheng
David C. Uthus
67
53
0
05 Nov 2020
What Does This Acronym Mean? Introducing a New Dataset for Acronym Identification and Disambiguation
Amir Pouran Ben Veyseh
Franck Dernoncourt
Quan Hung Tran
Thien Huu Nguyen
45
55
0
28 Oct 2020
TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification
Francesco Barbieri
Jose Camacho-Collados
Leonardo Neves
Luis Espinosa-Anke
VLM
84
721
0
23 Oct 2020
Explainable Automated Fact-Checking for Public Health Claims
Neema Kotonya
Francesca Toni
254
261
0
19 Oct 2020
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics
Anthony Chen
Gabriel Stanovsky
Sameer Singh
Matt Gardner
66
50
0
07 Oct 2020
"I'd rather just go to bed": Understanding Indirect Answers
Annie Louis
Dan Roth
Filip Radlinski
53
44
0
07 Oct 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Nikita Nangia
Clara Vania
Rasika Bhalerao
Samuel R. Bowman
118
682
0
30 Sep 2020
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
D. Song
Jacob Steinhardt
ELM
RALM
176
4,434
0
07 Sep 2020
Effective Transfer Learning for Identifying Similar Questions: Matching User Questions to COVID-19 FAQs
Clara H. McCreery
Namit Katariya
A. Kannan
Manish Chablani
X. Amatriain
MedIm
OOD
37
74
0
04 Aug 2020
BIOMRC: A Dataset for Biomedical Machine Reading Comprehension
Petros Stavropoulos
Dimitris Pappas
Ion Androutsopoulos
Ryan T. McDonald
51
51
0
13 May 2020
How Context Affects Language Models' Factual Predictions
Fabio Petroni
Patrick Lewis
Aleksandra Piktus
Tim Rocktäschel
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
51
239
0
10 May 2020
Neural CRF Model for Sentence Alignment in Text Simplification
Chao Jiang
Mounica Maddela
Wuwei Lan
Yang Zhong
Wenyuan Xu
63
161
0
05 May 2020
ProtoQA: A Question Answering Dataset for Prototypical Common-Sense Reasoning
Michael Boratko
Xiang Lorraine Li
Rajarshi Das
Timothy J. O'Gorman
Daniel Le
Andrew McCallum
80
58
0
02 May 2020
UnifiedQA: Crossing Format Boundaries With a Single QA System
Daniel Khashabi
Sewon Min
Tushar Khot
Ashish Sabharwal
Oyvind Tafjord
Peter Clark
Hannaneh Hajishirzi
136
739
0
02 May 2020
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
Bill Yuchen Lin
Seyeon Lee
Rahul Khanna
Xiang Ren
AIMat
55
158
0
02 May 2020
Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?
Yada Pruksachatkun
Jason Phang
Haokun Liu
Phu Mon Htut
Xiaoyi Zhang
Richard Yuanzhe Pang
Clara Vania
Katharina Kann
Samuel R. Bowman
CLL
LRM
62
197
0
01 May 2020
Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension
Max Bartolo
A. Roberts
Johannes Welbl
Sebastian Riedel
Pontus Stenetorp
AAML
107
175
0
02 Feb 2020
Break It Down: A Question Understanding Benchmark
Tomer Wolfson
Mor Geva
Ankit Gupta
Matt Gardner
Yoav Goldberg
Daniel Deutch
Jonathan Berant
75
188
0
31 Jan 2020
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
Alex Warstadt
Alicia Parrish
Haokun Liu
Anhad Mohananey
Wei Peng
Sheng-Fu Wang
Samuel R. Bowman
75
492
0
02 Dec 2019
SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization
Bogdan Gliwa
Iwona Mochol
M. Biesek
A. Wawer
122
631
0
27 Nov 2019
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
144
1,806
0
26 Nov 2019
Semantic Noise Matters for Neural Natural Language Generation
Ondřej Dušek
David M. Howcroft
Verena Rieser
71
118
0
10 Nov 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
125
1,006
0
31 Oct 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
249
10,829
0
29 Oct 2019
QASC: A Dataset for Question Answering via Sentence Composition
Tushar Khot
Peter Clark
Michal Guerquin
Peter Alexander Jansen
Ashish Sabharwal
CoGe
73
328
0
25 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
427
20,181
0
23 Oct 2019
WIQA: A dataset for "What if..." reasoning over procedural text
Niket Tandon
Bhavana Dalvi
Keisuke Sakaguchi
Antoine Bosselut
Peter Clark
61
101
0
10 Sep 2019
QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions
Oyvind Tafjord
Matt Gardner
Kevin Lin
Peter Clark
57
108
0
08 Sep 2019
"Going on a vacation" takes longer than "Going for a walk": A Study of Temporal Commonsense Understanding
Ben Zhou
Daniel Khashabi
Qiang Ning
Dan Roth
AIMat
92
197
0
06 Sep 2019
TabFact: A Large-scale Dataset for Table-based Fact Verification
Wenhu Chen
Hongmin Wang
Jianshu Chen
Yunkai Zhang
Hong Wang
Shiyang Li
Xiyou Zhou
William Yang Wang
LMTD
106
509
0
05 Sep 2019