ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2502.13189
  4. Cited By
MoBA: Mixture of Block Attention for Long-Context LLMs

MoBA: Mixture of Block Attention for Long-Context LLMs

18 February 2025
Enzhe Lu
Z. L. Jiang
Qingbin Liu
Yulun Du
Tao Jiang
Chao Hong
Shixuan Liu
Weiran He
Enming Yuan
Yuzhi Wang
Zhiqi Huang
Huan Yuan
Suting Xu
Xinran Xu
Guokun Lai
Yanru Chen
Huabin Zheng
Junjie Yan
Jianlin Su
Yuxin Wu
N. Zhang
Zhilin Yang
Xinyu Zhou
Mingxing Zhang
J. Qiu
ArXivPDFHTML

Papers citing "MoBA: Mixture of Block Attention for Long-Context LLMs"

16 / 16 papers shown
Title
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
Tianyu Fu
Yi Ge
Yichen You
Enshu Liu
Zhihang Yuan
Guohao Dai
Shengen Yan
Huazhong Yang
Yu Wang
MoE
LRM
62
0
0
27 May 2025
Understanding Transformer from the Perspective of Associative Memory
Understanding Transformer from the Perspective of Associative Memory
Shu Zhong
Mingyu Xu
Tenglong Ao
Guang Shi
35
0
0
26 May 2025
QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization
QwenLong-CPRS: Towards ∞\infty∞-LLMs with Dynamic Context Optimization
Weizhou Shen
Chenliang Li
Fanqi Wan
Shengyi Liao
Shaopeng Lai
...
Bin Yang
Ji Zhang
Fei Huang
Jingren Zhou
Ming Yan
34
0
0
23 May 2025
Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
Wang Yang
Zirui Liu
Hongye Jin
Qingyu Yin
Vipin Chaudhary
Xiaotian Han
ReLM
LRM
48
0
0
22 May 2025
Training-Free Efficient Video Generation via Dynamic Token Carving
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang
Jinbo Xing
Bin Xia
Shaoteng Liu
Bohao Peng
Xin Tao
Pengfei Wan
Eric Lo
Jiaya Jia
DiffM
VGen
59
0
0
22 May 2025
Enhancing Complex Instruction Following for Large Language Models with Mixture-of-Contexts Fine-tuning
Enhancing Complex Instruction Following for Large Language Models with Mixture-of-Contexts Fine-tuning
Yuheng Lu
ZiMeng Bai
Caixia Yuan
Huixing Jiang
Xiaojie Wang
LRM
86
0
0
17 May 2025
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
Yao Fu
Yao Fu
Yeqi Huang
Ping Nie
Zhan Lu
...
Dayou Du
Tairan Xu
Dayou Du
Edoardo Ponti
Luo Mai
MoE
92
0
0
16 May 2025
SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization
SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization
Huashan Sun
Shengyi Liao
Yansen Han
Yu Bai
Yang Gao
...
Weizhou Shen
Fanqi Wan
Ming Yan
J.N. Zhang
Fei Huang
116
0
0
16 May 2025
WuNeng: Hybrid State with Attention
WuNeng: Hybrid State with Attention
Liu Xiao
Li Zhiyuan
Lin Yueyu
384
0
0
27 Apr 2025
PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
Zihao An
Huajun Bai
Ziqiang Liu
Dong Li
E. Barsoum
127
0
0
23 Apr 2025
Random Long-Context Access for Mamba via Hardware-aligned Hierarchical Sparse Attention
Random Long-Context Access for Mamba via Hardware-aligned Hierarchical Sparse Attention
Xiang Hu
Jiaqi Leng
Jun Zhao
Kewei Tu
Wei Wu
Mamba
97
0
0
23 Apr 2025
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention
Yucheng Li
Huiqiang Jiang
Chengruidong Zhang
Qianhui Wu
Xufang Luo
...
Amir H. Abdi
Dongsheng Li
Jianfeng Gao
Yue Yang
Lili Qiu
81
2
0
22 Apr 2025
Adaptive Computation Pruning for the Forgetting Transformer
Adaptive Computation Pruning for the Forgetting Transformer
Zhixuan Lin
J. Obando-Ceron
Xu Owen He
Rameswar Panda
59
2
0
09 Apr 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
162
42
0
27 Mar 2025
Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer
Yujiao Yang
Jing Lian
Linhui Li
MoE
104
0
0
04 Mar 2025
Training-free and Adaptive Sparse Attention for Efficient Long Video Generation
Training-free and Adaptive Sparse Attention for Efficient Long Video Generation
Yifei Xia
Suhan Ling
Fangcheng Fu
Yijiao Wang
Huixia Li
Xuefeng Xiao
Tengjiao Wang
VGen
121
7
0
28 Feb 2025
1