ResearchTrend.AI

Reducing Activation Recomputation in Large Transformer Models (arXiv:2205.05198)
10 May 2022
V. Korthikanti
Jared Casper
Sangkug Lym
Lawrence C. McAfee
M. Andersch
M. Shoeybi
Bryan Catanzaro
    AI4CE

Papers citing "Reducing Activation Recomputation in Large Transformer Models"

50 / 167 papers shown
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu
Wilson Yan
Matei A. Zaharia
Pieter Abbeel
VGen
39
63
0
13 Feb 2024
ZeroPP: Unleashing Exceptional Parallelism Efficiency through Tensor-Parallelism-Free Methodology
Ding Tang
Lijuan Jiang
Jiecheng Zhou
Minxi Jin
Hengjie Li
Xingcheng Zhang
Zhiling Pei
Jidong Zhai
62
3
0
06 Feb 2024
EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models
Xuchen Pan
Yanxi Chen
Yaliang Li
Bolin Ding
Jingren Zhou
20
8
0
01 Feb 2024
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Daya Guo
Qihao Zhu
Dejian Yang
Zhenda Xie
Kai Dong
...
Yu-Huan Wu
Y. K. Li
Fuli Luo
Yingfei Xiong
W. Liang
ELM
62
687
0
25 Jan 2024
PartIR: Composing SPMD Partitioning Strategies for Machine Learning
Sami Alabed
Daniel Belov
Bart Chrzaszcz
Juliana Franco
Dominik Grewe
...
Michael Schaarschmidt
Timur Sitdikov
Agnieszka Swietlik
Dimitrios Vytiniotis
Joel Wee
33
3
0
20 Jan 2024
InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding
Qiaoling Chen
Diandian Gu
Guoteng Wang
Xun Chen
Yingtong Xiong
...
Qi Hu
Xin Jin
Yonggang Wen
Tianwei Zhang
Peng Sun
57
8
0
17 Jan 2024
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Damai Dai
Chengqi Deng
Chenggang Zhao
R. X. Xu
Huazuo Gao
...
Panpan Huang
Fuli Luo
Chong Ruan
Zhifang Sui
W. Liang
MoE
46
252
0
11 Jan 2024
TeleChat Technical Report
Zhongjiang He
Zihan Wang
Xinzhan Liu
Shixuan Liu
Yitong Yao
...
Zilu Huang
Sishi Xiong
Yuxiang Zhang
Chao Wang
Shuangyong Song
AI4MH
LRM
ALM
66
3
0
08 Jan 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
314
0
05 Jan 2024
Re-evaluating the Memory-balanced Pipeline Parallelism: BPipe
Mincong Huang
Chao Wang
Chi Ma
Yineng Zhang
Peng Zhang
Lei Yu
33
1
0
04 Jan 2024
Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu
Haoyang He
Tianle Han
Xu-Yao Zhang
Mengyuan Liu
...
Xintao Hu
Tuo Zhang
Ning Qiang
Tianming Liu
Bao Ge
SyDa
35
65
0
04 Jan 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
251
73
0
31 Dec 2023
Unicron: Economizing Self-Healing LLM Training at Scale
Tao He
Xue Li
Zhibin Wang
Kun Qian
Jingbo Xu
Wenyuan Yu
Jingren Zhou
27
15
0
30 Dec 2023
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
Jacob P. Portes
Alex Trott
Sam Havens
Daniel King
Abhinav Venigalla
Moin Nadeem
Nikhil Sardana
D. Khudia
Jonathan Frankle
28
17
0
29 Dec 2023
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference
Hongzheng Chen
Jiahao Zhang
Yixiao Du
Shaojie Xiang
Zichao Yue
Niansong Zhang
Yaohui Cai
Zhiru Zhang
65
35
0
23 Dec 2023
Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities
Alaa Saleh
Roberto Morabito
Sasu Tarkoma
Susanna Pirttikangas
Lauri Lovén
58
3
0
22 Dec 2023
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention
Kaiqiang Song
Xiaoyang Wang
Sangwoo Cho
Xiaoman Pan
Dong Yu
36
7
0
14 Dec 2023
Stateful Large Language Model Serving with Pensieve
Lingfan Yu
Jinyang Li
RALM
KELM
LLMAG
44
12
0
09 Dec 2023
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
41
30
0
08 Dec 2023
ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU
Zhengmao Ye
Dengchun Li
Jingqi Tian
Tingfeng Lan
Jie Zuo
...
Hui Lu
Yexi Jiang
Jian Sha
Ke Zhang
Mingjie Tang
102
5
0
05 Dec 2023
From Beginner to Expert: Modeling Medical Knowledge into General LLMs
Qiang Li
Xiaoyan Yang
Haowen Wang
Qin Wang
Lei Liu
...
Wangshu Zhang
Teng Xu
Jinjie Gu
Jing Zheng
Guannan Zhang
LM&MA
ELM
AI4MH
29
14
0
02 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
29
22
0
01 Dec 2023
Zero Bubble Pipeline Parallelism
Penghui Qi
Xinyi Wan
Guangxing Huang
Min Lin
35
24
0
30 Nov 2023
Zero-shot Conversational Summarization Evaluations with small Large Language Models
R. Manuvinakurike
Saurav Sahay
Sangeeta Manepalli
L. Nachman
ELM
LM&MA
30
0
0
29 Nov 2023
Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search
Zhiqi Lin
Youshan Miao
Guanbin Xu
Cheng Li
Olli Saarikivi
Saeed Maleki
Fan Yang
25
6
0
26 Nov 2023
Striped Attention: Faster Ring Attention for Causal Transformers
William Brandon
Aniruddha Nrusimha
Kevin Qian
Zack Ankner
Tian Jin
Zhiye Song
Jonathan Ragan-Kelley
24
36
0
15 Nov 2023
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
Johannes Hagemann
Samuel Weinbach
Konstantin Dobler
Maximilian Schall
Gerard de Melo
LRM
45
6
0
09 Nov 2023
Just-in-time Quantization with Processing-In-Memory for Efficient ML Training
M. Ibrahim
Shaizeen Aga
Ada Li
Suchita Pati
Mahzabeen Islam
34
3
0
08 Nov 2023
Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models
Longteng Zhang
Xiang Liu
Zeyu Li
Xinglin Pan
Peijie Dong
...
Rui Guo
Xin Wang
Qiong Luo
S. Shi
Xiaowen Chu
49
7
0
07 Nov 2023
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Ying Sheng
Shiyi Cao
Dacheng Li
Coleman Hooper
Nicholas Lee
...
Banghua Zhu
Lianmin Zheng
Kurt Keutzer
Joseph E. Gonzalez
Ion Stoica
MoE
28
90
0
06 Nov 2023
Ultra-Long Sequence Distributed Transformer
Xiao Wang
Isaac Lyngaas
A. Tsaris
Peng Chen
Sajal Dash
Mayanka Chandra Shekar
Tao Luo
Hong-Jun Yoon
Mohamed Wahib
John P. Gounley
35
4
0
04 Nov 2023
Coop: Memory is not a Commodity
Jianhao Zhang
Shihan Ma
Peihong Liu
Jinhui Yuan
41
4
0
01 Nov 2023
Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision
Jiaxin Zhang
Zhuohang Li
Kamalika Das
Kumar Sricharan
40
2
0
31 Oct 2023
Skywork: A More Open Bilingual Foundation Model
Tianwen Wei
Liang Zhao
Lichang Zhang
Bo Zhu
Lijie Wang
...
Yongyi Peng
Xiaojuan Liang
Shuicheng Yan
Han Fang
Yahui Zhou
38
93
0
30 Oct 2023
TRANSOM: An Efficient Fault-Tolerant System for Training LLMs
Baodong Wu
Lei Xia
Qingping Li
Kangyu Li
Xu Chen
Yongqiang Guo
Tieyao Xiang
Yuheng Chen
Shigang Li
40
11
0
16 Oct 2023
BC4LLM: Trusted Artificial Intelligence When Blockchain Meets Large Language Models
Haoxiang Luo
Jian Luo
Athanasios V. Vasilakos
39
9
0
10 Oct 2023
Rethinking Memory and Communication Cost for Efficient Large Language Model Training
Chan Wu
Hanxiao Zhang
Lin Ju
Jinjing Huang
Youshao Xiao
...
Siyuan Li
Fanzhuang Meng
Lei Liang
Xiaolu Zhang
Jun Zhou
26
4
0
09 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens
Robin Algayres
Yossi Adi
Tu Nguyen
Jade Copet
Gabriel Synnaeve
Benoît Sagot
Emmanuel Dupoux
AuLLM
46
12
0
08 Oct 2023
Scaling Laws of RoPE-based Extrapolation
Xiaoran Liu
Hang Yan
Shuo Zhang
Chen An
Xipeng Qiu
Dahua Lin
28
83
0
08 Oct 2023
DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training
Dacheng Li
Rulin Shao
Anze Xie
Eric P. Xing
Xuezhe Ma
Ion Stoica
Joseph E. Gonzalez
Hao Zhang
34
18
0
05 Oct 2023
Ring Attention with Blockwise Transformers for Near-Infinite Context
Hao Liu
Matei A. Zaharia
Pieter Abbeel
46
218
0
03 Oct 2023
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
S. A. Jacobs
Masahiro Tanaka
Chengming Zhang
Minjia Zhang
L. Song
Samyam Rajbhandari
Yuxiong He
25
104
0
25 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
Kaiyuan Gao
Su He
Zhenyu He
Jiacheng Lin
Qizhi Pei
Jie Shao
Wei Zhang
LM&MA
SyDa
38
4
0
27 Aug 2023
LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning
Longteng Zhang
Lin Zhang
S. Shi
Xiaowen Chu
Bo-wen Li
AI4CE
18
92
0
07 Aug 2023
PROV-IO+: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems
Runzhou Han
Mai Zheng
S. Byna
Houjun Tang
Bin Dong
...
Yong Chen
Dongkyun Kim
Joseph Hassoun
D. Thorsley
Matthew Wolf
34
2
0
02 Aug 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
70
538
0
12 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
41
152
0
05 Jul 2023
FedJETs: Efficient Just-In-Time Personalization with Federated Mixture of Experts
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Robert Sim
Anastasios Kyrillidis
Dimitrios Dimitriadis
FedML
MoE
32
6
0
14 Jun 2023
Blockwise Parallel Transformer for Large Context Models
Hao Liu
Pieter Abbeel
49
11
0
30 May 2023