
Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance

arXiv:2203.03341 · 7 March 2022
Hiroyuki Ootomo, Rio Yokota

Papers citing "Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance"

7 / 7 papers shown
Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library
Hiroyuki Ootomo, Rio Yokota
29 Aug 2023
Generative Artificial Intelligence Reproducibility and Consensus
Edward J. Kim, I. Isozaki, N. Sirkin, Michael Robson
04 Jul 2023
DGEMM on Integer Matrix Multiplication Unit
Hiroyuki Ootomo, K. Ozaki, Rio Yokota
21 Jun 2023
Mixed-Precision Random Projection for RandNLA on Tensor Cores
Hiroyuki Ootomo, Rio Yokota
10 Apr 2023
Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection
Hiroyuki Ootomo, Hidetaka Manabe, K. Harada, Rio Yokota
15 Mar 2023
Myths and Legends in High-Performance Computing
Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler
06 Jan 2023
POAS: A high-performance scheduling framework for exploiting Accelerator Level Parallelism
Pablo Antonio Martínez, Gregorio Bernabé, J. M. García
21 Sep 2022