2203.06850
Cited By
Efficient Language Modeling with Sparse all-MLP
14 March 2022
Ping Yu
Mikel Artetxe
Myle Ott
Sam Shleifer
Hongyu Gong
Ves Stoyanov
Xian Li
MoE
Papers citing "Efficient Language Modeling with Sparse all-MLP"
9 / 9 papers shown
Title
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
Yixiao Wang
Yifei Zhang
Mingxiao Huo
Ran Tian
Xiang Zhang
...
Chenfeng Xu
Pengliang Ji
Wei Zhan
Mingyu Ding
M. Tomizuka
MoE
36
18
0
01 Jul 2024
Turn Waste into Worth: Rectifying Top-k Router of MoE
Zhiyuan Zeng
Qipeng Guo
Zhaoye Fei
Zhangyue Yin
Yunhua Zhou
Linyang Li
Tianxiang Sun
Hang Yan
Dahua Lin
Xipeng Qiu
MoE
MoMe
25
4
0
17 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
41
47
0
15 Feb 2024
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Fuzhao Xue
Zian Zheng
Yao Fu
Jinjie Ni
Zangwei Zheng
Wangchunshu Zhou
Yang You
MoE
20
87
0
29 Jan 2024
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
22
5
0
09 Jan 2024
Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth
Haokun Liu
Colin Raffel
MoMe
MoE
27
45
0
06 Jun 2023
Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models
Ze-Feng Gao
Peiyu Liu
Wayne Xin Zhao
Zhong-Yi Lu
Ji-Rong Wen
MoE
16
27
0
02 Mar 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language
Francesco Fusco
Damian Pascual
Peter W. J. Staar
Diego Antognini
37
29
0
09 Feb 2022
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
271
2,603
0
04 May 2021