ResearchTrend.AI
Mixture-of-Experts with Expert Choice Routing

18 February 2022
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
    MoE

Papers citing "Mixture-of-Experts with Expert Choice Routing"

13 / 63 papers shown
Enhancing Mobile Face Anti-Spoofing: A Robust Framework for Diverse Attack Types under Screen Flash
Weihua Liu
Chaochao Lin
Yunzhen Yan
CVBM
AAML
13
1
0
29 Aug 2023
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Wenyan Cong
Hanxue Liang
Peihao Wang
Zhiwen Fan
Tianlong Chen
M. Varma
Yi Wang
Zhangyang Wang
MoE
27
21
0
22 Aug 2023
Robust Mixture-of-Expert Training for Convolutional Neural Networks
Yihua Zhang
Ruisi Cai
Tianlong Chen
Guanhua Zhang
Huan Zhang
Pin-Yu Chen
Shiyu Chang
Zhangyang Wang
Sijia Liu
MoE
AAML
OOD
32
16
0
19 Aug 2023
From Sparse to Soft Mixtures of Experts
J. Puigcerver
C. Riquelme
Basil Mustafa
N. Houlsby
MoE
121
114
0
02 Aug 2023
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
Yu Zhang
Hao Cheng
Zhihong Shen
Xiaodong Liu
Ye-Yi Wang
Jianfeng Gao
24
13
0
23 May 2023
Perpetual Humanoid Control for Real-time Simulated Avatars
Zhengyi Luo
Jinkun Cao
Alexander W. Winkler
Kris M. Kitani
Weipeng Xu
44
88
0
10 May 2023
Scaling Expert Language Models with Unsupervised Domain Discovery
Suchin Gururangan
Margaret Li
M. Lewis
Weijia Shi
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoE
17
46
0
24 Mar 2023
Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
Michael E. Sander
J. Puigcerver
Josip Djolonga
Gabriel Peyré
Mathieu Blondel
16
18
0
02 Feb 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
25
209
0
16 Jan 2023
Tricks for Training Sparse Translation Models
Dheeru Dua
Shruti Bhosale
Vedanuj Goswami
James Cross
M. Lewis
Angela Fan
MoE
145
19
0
15 Oct 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
228
4,460
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,817
0
17 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,950
0
20 Apr 2018