Hard Mixtures of Experts for Large Scale Weakly Supervised Vision
20 April 2017
Sam Gross, Marc'Aurelio Ranzato, Arthur Szlam
Topics: MoE

Papers citing "Hard Mixtures of Experts for Large Scale Weakly Supervised Vision"

29 papers shown

Switch-Based Multi-Part Neural Network
Surajit Majumder, Paritosh Ranjan, Prodip Roy, Bhuban Padhan
Topics: OOD
25 Apr 2025

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen
Topics: MoE
02 Apr 2025

A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu, Sen Lin
Topics: MoE
10 Mar 2025

No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova, Angelos Katharopoulos, David Grangier, Ronan Collobert
Topics: MoE
04 Oct 2024

Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo, Zhenglin Cheng, Xiaoying Tang, Tao Lin
Topics: MoE
23 May 2024

Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko, Zhan Su, E. Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni
Topics: MoMe
18 May 2024

DiPaCo: Distributed Path Composition
Arthur Douillard, Qixuang Feng, Andrei A. Rusu, A. Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam
Topics: MoE
15 Mar 2024

LLMBind: A Unified Modality-Task Integration Framework
Bin Zhu, Munan Ning, Peng Jin, Bin Lin, Jinfa Huang, ..., Junwu Zhang, Zhenyu Tang, Mingjun Pan, Xing Zhou, Li-ming Yuan
Topics: MLLM
22 Feb 2024

Multimodal Clinical Trial Outcome Prediction with Large Language Models
Wenhao Zheng, Dongsheng Peng, Hongxia Xu, Yun-Qing Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao
09 Feb 2024

HOPE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts
Do Huu Dat, Po Yuan Mao, Tien Hoang Nguyen, Wray L. Buntine, Bennamoun
23 Nov 2023

Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, M. Varma, Yi Wang, Zhangyang Wang
Topics: MoE
22 Aug 2023

Robust Mixture-of-Expert Training for Convolutional Neural Networks
Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, Huan Zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu
Topics: MoE, AAML, OOD
19 Aug 2023

Revisiting Single-gated Mixtures of Experts
Amelie Royer, I. Karmanov, Andrii Skliar, B. Bejnordi, Tijmen Blankevoort
Topics: MoE, MoMe
11 Apr 2023

Scaling Expert Language Models with Unsupervised Domain Discovery
Suchin Gururangan, Margaret Li, M. Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer
Topics: MoE
24 Mar 2023

Gated Self-supervised Learning For Improving Supervised Learning
Erland Hilman Fuadi, Aristo Renaldo Ruslim, Putu Wahyu Kusuma Wardhana, N. Yudistira
Topics: SSL
14 Jan 2023

Spatial Mixture-of-Experts
Nikoli Dryden, Torsten Hoefler
Topics: MoE
24 Nov 2022

M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang
Topics: MoE
26 Oct 2022

Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts
Tao Zhong, Zhixiang Chi, Li Gu, Yang Wang, Yuanhao Yu, Jingshan Tang
Topics: OOD
08 Oct 2022

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Andrew M. Saxe, Shagun Sodhani, Sam Lewallen
Topics: AI4CE
21 Jul 2022

Adaptive Mixture of Experts Learning for Generalizable Face Anti-Spoofing
Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Shouhong Ding, Lizhuang Ma
Topics: OOD, CVBM
20 Jul 2022

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
Zhao You, Shulin Feng, Dan Su, Dong Yu
07 Apr 2022

Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution
Jie Liang, Huiyu Zeng, Lei Zhang
Topics: SupR
27 Mar 2022

Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou, Tao Lei, Han-Chu Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew M. Dai, Zhifeng Chen, Quoc V. Le, James Laudon
Topics: MoE
18 Feb 2022

Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes
Jongseo Lee, Jianxiang Feng, Matthias Humt, M. Müller, Rudolph Triebel
Topics: UQCV
20 Sep 2021

Scaling Vision with Sparse Mixture of Experts
C. Riquelme, J. Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, N. Houlsby
Topics: MoE
10 Jun 2021

MOS: Towards Scaling Out-of-distribution Detection for Large Semantic Space
Rui Huang, Yixuan Li
Topics: OODD
05 May 2021

High-Capacity Expert Binary Networks
Adrian Bulat, Brais Martínez, Georgios Tzimiropoulos
Topics: MQ
07 Oct 2020

Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data
Xi Yan, David Acuna, Sanja Fidler
09 Jan 2020

Exploring the Limits of Weakly Supervised Pretraining
D. Mahajan, Ross B. Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin R. Bharambe, Laurens van der Maaten
Topics: VLM
02 May 2018