BlackMamba: Mixture of Experts for State-Space Models


1 February 2024
Quentin G. Anthony
Yury Tokpanov
Paolo Glorioso
Beren Millidge
arXiv: 2402.01771

Papers citing "BlackMamba: Mixture of Experts for State-Space Models"

6 / 6 papers shown
Understanding the Performance and Estimating the Cost of LLM Fine-Tuning
Yuchen Xia
Jiho Kim
Yuhan Chen
Haojie Ye
Souvik Kundu
Cong Hao
Nishil Talati
MoE
35
20
0
08 Aug 2024
MambaLRP: Explaining Selective State Space Sequence Models
F. Jafari
G. Montavon
Klaus-Robert Müller
Oliver Eberle
Mamba
62
9
0
11 Jun 2024
CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification
Guangqian Yang
Kangrui Du
Zhihan Yang
Ye Du
Yongping Zheng
Shujun Wang
42
16
0
25 Mar 2024
Zoology: Measuring and Improving Recall in Efficient Language Models
Simran Arora
Sabri Eyuboglu
Aman Timalsina
Isys Johnson
Michael Poli
James Zou
Atri Rudra
Christopher Ré
64
66
0
08 Dec 2023
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
279
1,996
0
31 Dec 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019