ResearchTrend.AI
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing

1 May 2025
Piotr Piekos, Róbert Csordás, Jürgen Schmidhuber
    MoE
    VLM

Papers citing "Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing"

1 / 1 papers shown
Title
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Zihan Qiu, Zekun Wang, Bo Zheng, Zeyu Huang, Kaiyue Wen, ..., Fei Huang, Suozhi Huang, Dayiheng Liu, Jingren Zhou, Junyang Lin
MoE
28
0
0
10 May 2025