ResearchTrend.AI
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

7 May 2024
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
Bo Liu
Chenggang Zhao
Chengqi Deng
Chong Ruan
Damai Dai
Daya Guo
Dejian Yang
Deli Chen
Dongjie Ji
Erhang Li
Fangyun Lin
Fuli Luo
Guangbo Hao
Guanting Chen
Guowei Li
Hai-Tao Zhang
Hanwei Xu
Hao Yang
Haowei Zhang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Li
Hui Qu
Jianfeng Cai
Jian Liang
Jianzhong Guo
Jiaqi Ni
Jiashi Li
Jin Chen
Jingyang Yuan
Junjie Qiu
Junxiao Song
Kai Dong
Kaige Gao
Kang Guan
Lean Wang
Lecong Zhang
Lei Xu
Leyi Xia
Liang Zhao
Liyue Zhang
Meng Li
Miaojun Wang
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Mingming Li
Ning Tian
Panpan Huang
Peiyi Wang
Peng Zhang
Qihao Zhu
Qinyu Chen
Qiushi Du
Ruoxin Chen
Rong Jin
Ruiqi Ge
Ruizhe Pan
Runxin Xu
Ruyi Chen
S. S. Li
Shanghao Lu
Shangyan Zhou
Shanhuang Chen
Shaoqing Wu
Shengfeng Ye
Shirong Ma
Shiyu Wang
Shuang Zhou
Shuiping Yu
Shunfeng Zhou
Wenlei Bao
Tao Wang
Tian Pei
Tian Yuan
Tianyu Sun
W. L. Xiao
Wangding Zeng
Wei An
Wen Liu
Wenfeng Liang
Wenjun Gao
Wentao Zhang
X. Q. Li
Xiangyue Jin
Xianzu Wang
Xiao Bi
Xiaodong Liu
Xiaohan Wang
Xiaojin Shen
Xiaokang Chen
Xiaosha Chen
Xiaotao Nie
Xiaowen Sun
Xiaoxiang Wang
Xin Liu
Xin Xie
Xingkai Yu
Xinnan Song
Xinyi Zhou
Xinyu Yang
Xuan Lu
Xuecheng Su
Ying Wu
Y. K. Li
Y. X. Wei
Yichen Zhu
Yanhong Xu
Yanping Huang
Yao Li
Yao-Min Zhao
Yaofeng Sun
Yaohui Li
Yaohui Wang
Yi Zheng
Yichao Zhang
Yiliang Xiong
Yilong Zhao
Ying He
Ying Tang
Yishi Piao
Yixin Dong
Yixuan Tan
Yiyuan Liu
Yongji Wang
Yongqiang Guo
Yuchen Zhu
Yuduan Wang
Yuheng Zou
Yukun Zha
Yunxian Ma
Yuting Yan
Yuxiang You
Yuxuan Liu
Z. Z. Ren
Zehui Ren
Zhangli Sha
Zhe Fu
Zhen Huang
Zhen Zhang
Zhenda Xie
Zhewen Hao
Zhihong Shao
Zhiniu Wen
Zhipeng Xu
Zhongyu Zhang
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
    MoE

Papers citing "DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model"

8 / 108 papers shown
Title
Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis
Yao Fu
35
19
0
14 May 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
44
54
0
13 Mar 2024
COMAE: COMprehensive Attribute Exploration for Zero-shot Hashing
Yuqi Li
Qingqing Long
Yihang Zhou
Ran Zhang
Zhiyuan Ning
Zhihong Zhu
Yuanchun Zhou
Xuezhi Wang
Meng Xiao
VLM
54
3
0
26 Feb 2024
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori
Tian Tang
Yile Gu
Kan Zhu
Baris Kasikci
71
20
0
10 Feb 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
309
0
05 Jan 2024
AlignBench: Benchmarking Chinese Alignment of Large Language Models
Xiao Liu
Xuanyu Lei
Sheng-Ping Wang
Yue Huang
Zhuoer Feng
...
Hongning Wang
Jing Zhang
Minlie Huang
Yuxiao Dong
Jie Tang
ELM
LM&MA
ALM
125
43
0
30 Nov 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
363
12,003
0
04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
1,996
0
31 Dec 2020