arXiv:2503.07137
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
10 March 2025
Siyuan Mu
Sen Lin
MoE
Papers citing
"A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications"
50 / 203 papers shown
Title
Towards a Human-like Open-Domain Chatbot
Daniel De Freitas
Minh-Thang Luong
David R. So
Jamie Hall
Noah Fiedel
...
Zi Yang
Apoorv Kulshreshtha
Gaurav Nemade
Yifeng Lu
Quoc V. Le
91
935
0
27 Jan 2020
A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning
Soochan Lee
Junsoo Ha
Dongsu Zhang
Gunhee Kim
BDL
CLL
80
211
0
03 Jan 2020
Private Federated Learning with Domain Adaptation
Daniel W. Peterson
Pallika H. Kanani
Virendra J. Marathe
FedML
38
81
0
13 Dec 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
392
20,114
0
23 Oct 2019
Reinforcement Learning in Healthcare: A Survey
Chao Yu
Jiming Liu
S. Nemati
LM&MA
OffRL
173
570
0
22 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
543
24,422
0
26 Jul 2019
Convergence Rates for Gaussian Mixtures of Experts
Nhat Ho
Chiao-Yu Yang
Michael I. Jordan
38
41
0
09 Jul 2019
MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies
Xue Bin Peng
Michael Chang
Grace Zhang
Pieter Abbeel
Sergey Levine
57
197
0
23 May 2019
Towards Universal Object Detection by Domain Attention
Xudong Wang
Zhaowei Cai
Dashan Gao
Nuno Vasconcelos
OOD
70
196
0
09 Apr 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
Kate Rakelly
Aurick Zhou
Deirdre Quillen
Chelsea Finn
Sergey Levine
OffRL
78
656
0
19 Mar 2019
One-Shot Federated Learning
Neel Guha
Ameet Talwalkar
Virginia Smith
FedML
55
216
0
28 Feb 2019
Improving Adversarial Robustness of Ensembles with Diversity Training
Sanjay Kariyappa
Moinuddin K. Qureshi
AAML
FedML
42
135
0
28 Jan 2019
Improving Adversarial Robustness via Promoting Ensemble Diversity
Tianyu Pang
Kun Xu
Chao Du
Ning Chen
Jun Zhu
AAML
60
437
0
25 Jan 2019
Dropout Regularization in Hierarchical Mixture of Experts
Ozan Irsoy
Ethem Alpaydin
BDL
24
15
0
25 Dec 2018
Task-Free Continual Learning
Rahaf Aljundi
Klaas Kelchtermans
Tinne Tuytelaars
CLL
120
359
0
10 Dec 2018
Meta-Transfer Learning for Few-Shot Learning
Qianru Sun
Yaoyao Liu
Tat-Seng Chua
Bernt Schiele
199
1,070
0
06 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,729
0
11 Oct 2018
Multi-Source Domain Adaptation with Mixture of Experts
Jiang Guo
Darsh J. Shah
Regina Barzilay
43
178
0
07 Sep 2018
Neural Processes
M. Garnelo
Jonathan Richard Schwarz
Dan Rosenbaum
Fabio Viola
Danilo Jimenez Rezende
S. M. Ali Eslami
Yee Whye Teh
BDL
UQCV
GP
87
514
0
04 Jul 2018
MEGAN: Mixture of Experts of Generative Adversarial Networks for Multimodal Image Generation
D. Park
Seungjoo Yoo
Hyojin Bahng
Jaegul Choo
Noseong Park
GAN
34
24
0
07 May 2018
Meta-Learning for Semi-Supervised Few-Shot Classification
Mengye Ren
Eleni Triantafillou
Sachin Ravi
Jake C. Snell
Kevin Swersky
J. Tenenbaum
Hugo Larochelle
R. Zemel
SSL
65
1,283
0
02 Mar 2018
Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning
Vladimir Feinberg
Alvin Wan
Ion Stoica
Michael I. Jordan
Joseph E. Gonzalez
Sergey Levine
OffRL
56
317
0
28 Feb 2018
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
78
451
0
28 Feb 2018
Deep Learning for Sentiment Analysis: A Survey
Lei Zhang
Shuai Wang
Bing Liu
VLM
93
1,619
0
24 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
290
8,329
0
04 Jan 2018
Learning Sparse Neural Networks through $L_0$ Regularization
Christos Louizos
Max Welling
Diederik P. Kingma
417
1,144
0
04 Dec 2017
Person Transfer GAN to Bridge Domain Gap for Person Re-Identification
Longhui Wei
Shiliang Zhang
Wen Gao
Q. Tian
GAN
91
1,668
0
23 Nov 2017
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Min Zhang
73
2,835
0
09 Aug 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
448
19,006
0
20 Jul 2017
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Chen Sun
Abhinav Shrivastava
Saurabh Singh
Abhinav Gupta
VLM
180
2,393
0
10 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
658
131,414
0
12 Jun 2017
Hard Mixtures of Experts for Large Scale Weakly Supervised Vision
Sam Gross
Marc'Aurelio Ranzato
Arthur Szlam
MoE
43
102
0
20 Apr 2017
Prototypical Networks for Few-shot Learning
Jake C. Snell
Kevin Swersky
R. Zemel
287
8,129
0
15 Mar 2017
What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?
Alex Kendall
Y. Gal
BDL
OOD
UD
UQCV
PER
348
4,704
0
15 Mar 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
806
11,894
0
09 Mar 2017
Learning a Unified Control Policy for Safe Falling
Visak C. V. Kumar
Sehoon Ha
Karen Liu
33
19
0
08 Mar 2017
Robustness to Adversarial Examples through an Ensemble of Specialists
Mahdieh Abbasi
Christian Gagné
AAML
79
109
0
22 Feb 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
244
2,635
0
23 Jan 2017
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
788
5,806
0
05 Dec 2016
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
339
7,498
0
02 Dec 2016
Expert Gate: Lifelong Learning with a Network of Experts
Rahaf Aljundi
Punarjay Chakravarty
Tinne Tuytelaars
CLL
75
660
0
18 Nov 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhiwen Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
889
6,787
0
26 Sep 2016
Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer
Coline Devin
Abhishek Gupta
Trevor Darrell
Pieter Abbeel
Sergey Levine
OffRL
82
397
0
22 Sep 2016
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
C. Ledig
Lucas Theis
Ferenc Huszár
Jose Caballero
Andrew Cunningham
...
Andrew P. Aitken
Alykhan Tejani
J. Totz
Zehan Wang
Wenzhe Shi
GAN
240
10,686
0
15 Sep 2016
Matching Networks for One Shot Learning
Oriol Vinyals
Charles Blundell
Timothy Lillicrap
Koray Kavukcuoglu
Daan Wierstra
VLM
353
7,316
0
13 Jun 2016
Neural Architectures for Named Entity Recognition
Guillaume Lample
Miguel Ballesteros
Sandeep Subramanian
Kazuya Kawakami
Chris Dyer
219
2
0
04 Mar 2016
Communication-Efficient Learning of Deep Networks from Decentralized Data
H. B. McMahan
Eider Moore
Daniel Ramage
S. Hampson
Blaise Agüera y Arcas
FedML
380
17,453
0
17 Feb 2016
A Universal Approximation Theorem for Mixture of Experts Models
Hien Nguyen
Luke R. Lloyd-Jones
Geoffrey J. McLachlan
30
41
0
11 Feb 2016
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Alec Radford
Luke Metz
Soumith Chintala
GAN
OOD
243
14,005
0
19 Nov 2015
A Neural Attention Model for Abstractive Sentence Summarization
Alexander M. Rush
S. Chopra
Jason Weston
CVBM
180
2,700
0
02 Sep 2015