1406.7362
Cited By
Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning
28 June 2014
Kyunghyun Cho
Yoshua Bengio
Papers citing
"Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning"
14 / 14 papers shown
GradMDM: Adversarial Attack on Dynamic Networks
Jianhong Pan
Lin Geng Foo
Qichen Zheng
Zhipeng Fan
Hossein Rahmani
Qiuhong Ke
Xiaozhong Liu
AAML
16
6
0
01 Apr 2023
Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu
38
4
0
20 Mar 2023
Spatial Mixture-of-Experts
Nikoli Dryden
Torsten Hoefler
MoE
34
9
0
24 Nov 2022
Switchable Representation Learning Framework with Self-compatibility
Shengsen Wu
Yan Bai
Yihang Lou
Xiongkun Linghu
Jianzhong He
Ling-yu Duan
22
1
0
16 Jun 2022
Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Yang Shu
Zhangjie Cao
Ziyang Zhang
Jianmin Wang
Mingsheng Long
17
4
0
08 Jun 2022
APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction
Bencheng Yan
Pengjie Wang
Kai Zhang
Feng Li
Hongbo Deng
Jian Xu
Bo Zheng
27
20
0
30 Mar 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
329
0
18 Feb 2022
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
61
188
0
20 Dec 2021
Zoo-Tuning: Adaptive Transfer from a Zoo of Models
Yang Shu
Zhi Kou
Zhangjie Cao
Jianmin Wang
Mingsheng Long
29
44
0
29 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
17
575
0
10 Jun 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
11
2,075
0
11 Jan 2021
Conditional Computation for Continual Learning
Min-Bin Lin
Jie Fu
Yoshua Bengio
CLL
31
10
0
16 Jun 2019
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
55
2,513
0
23 Jan 2017
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton
Nitish Srivastava
A. Krizhevsky
Ilya Sutskever
Ruslan Salakhutdinov
VLM
266
7,638
0
03 Jul 2012