arXiv: 2305.01024
Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs

1 May 2023
Shixun Wu
Yujia Zhai
Jinyang Liu
Jiajun Huang
Zizhe Jian
Bryan M. Wong
Zizhong Chen

Papers citing "Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs"

8 / 8 papers shown
Title
FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention
Huangliang Dai
Shixun Wu
Hairui Zhao
Jiajun Huang
Zizhe Jian
Yue Zhu
Haiyang Hu
Zizhong Chen
54
0
0
03 Apr 2025
TurboFFT: Co-Designed High-Performance and Fault-Tolerant Fast Fourier Transform on GPUs
Shixun Wu
Yujia Zhai
Jinyang Liu
Jiajun Huang
Zizhe Jian
Huangliang Dai
Sheng Di
Franck Cappello
Zizhong Chen
72
2
0
08 Dec 2024
DGRO: Diameter-Guided Ring Optimization for Integrated Research Infrastructure Membership
Shixun Wu
Krishnan Raghavan
Sheng Di
Zizhong Chen
Franck Cappello
26
1
0
14 Oct 2024
FT K-means: A High-Performance K-means on GPU with Fault Tolerance
Shixun Wu
Yitong Ding
Yujia Zhai
Jinyang Liu
Jiajun Huang
...
Huangliang Dai
Sheng Di
Bryan M. Wong
Zizhong Chen
Franck Cappello
40
2
0
02 Aug 2024
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
Jingwei Xu
Junyu Lai
Yunpeng Huang
Tags: MoE, MoMe
40
9
0
19 May 2024
FT-GEMM: A Fault Tolerant High Performance GEMM Implementation on x86 CPUs
Shixun Wu
Yujia Zhai
Jiajun Huang
Zizhe Jian
Zizhong Chen
Tags: FedML
14
6
0
03 May 2023
ApproxABFT: Approximate Algorithm-Based Fault Tolerance for Neural Network Processing
Xing-xiong Xue
Cheng Liu
Haitong Huang
Bo Liu
Ying Wang
36
0
0
21 Feb 2023
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Yujia Zhai
Chengquan Jiang
Leyuan Wang
Xiaoying Jia
Shang Zhang
Zizhong Chen
Xin Liu
Yibo Zhu
62
48
0
06 Oct 2022