ResearchTrend.AI
Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion

6 October 2023
Filip Szatkowski
Bartosz Wójcik
Mikołaj Piórczyński
Simone Scardapane
    MoE
arXiv: 2310.04361

Papers citing "Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion"

12 / 12 papers shown
Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs
Yiheng Yang
Yujie Wang
Chi Ma
Lei Yu
Emmanuele Chersoni
Chu-Ren Huang
79
0
0
26 Feb 2025
First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models
Chi Ma
Mincong Huang
Ying Zhang
Chao Wang
Yujie Wang
Lei Yu
Chuan Liu
Wei Lin
AI4CE
LLMSV
53
2
0
21 Aug 2024
CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models
Je-Yong Lee
Donghyun Lee
Genghan Zhang
Mo Tiwari
Azalia Mirhoseini
44
15
0
12 Apr 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
74
68
0
27 Feb 2024
HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference
Yashas Samaga
Varun Yerram
Chong You
Srinadh Bhojanapalli
Sanjiv Kumar
Prateek Jain
Praneeth Netrapalli
59
4
0
14 Feb 2024
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models
Iman Mirzadeh
Keivan Alizadeh-Vahid
Sachin Mehta
Carlo C. Del Mundo
Oncel Tuzel
Golnoosh Samei
Mohammad Rastegari
Mehrdad Farajtabar
126
60
0
06 Oct 2023
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
329
0
18 Feb 2022
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
141
684
0
31 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,431
0
04 Jan 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,217
0
01 Sep 2014