ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2409.00879
  4. Cited By
Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

2 September 2024
Youngseog Chung
Dhruv Malik
J. Schneider
Yuanzhi Li
Aarti Singh
    MoE
ArXivPDFHTML

Papers citing "Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts"

6 / 6 papers shown
Title
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Ghada Sokar
J. Obando-Ceron
Aaron C. Courville
Hugo Larochelle
Pablo Samuel Castro
MoE
127
2
0
02 Oct 2024
On Least Square Estimation in Softmax Gating Mixture of Experts
On Least Square Estimation in Softmax Gating Mixture of Experts
Huy Nguyen
Nhat Ho
Alessandro Rinaldo
51
13
0
05 Feb 2024
Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture
  of Adapters
Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters
Umberto Cappellazzo
Daniele Falavigna
A. Brutti
MoE
38
2
0
01 Feb 2024
Mixture-of-Experts with Expert Choice Routing
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
327
0
18 Feb 2022
What is the State of Neural Network Pruning?
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
191
1,027
0
06 Mar 2020
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
249
4,489
0
23 Jan 2020
1