NVIDIA Tensor Core Programmability, Performance & Precision

11 March 2018

Stefano Markidis

Steven W. D. Chien

Jeffrey S. Vetter

Papers citing "NVIDIA Tensor Core Programmability, Performance & Precision"

Title | Authors | Tags | Counts | Date
Hexcute: A Tile-based Programming Language with Automatic Layout and Task-Mapping Synthesis | Xinsong Zhang, Yaoyao Ding, Yang Hu, Gennady Pekhimenko | - | 113, 0, 0 | 22 Apr 2025
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization | Zhenyu Liang, Hao Li, Naiwei Yu, Kebin Sun, Ran Cheng | - | 117, 1, 0 | 26 Mar 2025
Semiring Activation in Neural Networks | B. Smets, Peter D. Donker, Jim W. Portegies | LLMSV | 66, 0, 0 | 29 May 2024
Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks | Urs Koster, T. Webb, Xin Eric Wang, Marcel Nassar, Arjun K. Bansal, ..., Luke Hornof, A. Khosrowshahi, Carey Kloss, Ruby J. Pai, N. Rao | MQ | 47, 262, 0 | 06 Nov 2017
Mixed Precision Training | Paulius Micikevicius, Sharan Narang, Jonah Alben, G. Diamos, Erich Elsen, ..., Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu | - | 176, 1,805, 0 | 10 Oct 2017
On the Strong Scaling of the Spectral Element Solver Nek5000 on Petascale Systems | N. Offermans, O. Marin, Michel Schanen, Jing Gong, Paul F. Fischer, P. Schlatter, Aleks Obabko, Adam Peplinski, Maxwell Hutchinson, Elia Merzari | LRM | 22, 80, 0 | 09 Jun 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit | N. Jouppi, C. Young, Nishant Patil, David Patterson, Gaurav Agrawal, ..., Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, Doe Hyun Yoon | - | 237, 4,644, 0 | 16 Apr 2017
TensorFlow: A system for large-scale machine learning | Martín Abadi, P. Barham, Jianmin Chen, Zhiwen Chen, Andy Davis, ..., Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zhang | GNN, AI4CE | 433, 18,361, 0 | 27 May 2016
Deep Learning with Limited Numerical Precision | Suyog Gupta, A. Agrawal, K. Gopalakrishnan, P. Narayanan | HAI | 207, 2,049, 0 | 09 Feb 2015
Training deep neural networks with low precision multiplications | Matthieu Courbariaux, Yoshua Bengio, J. David | MQ | 81, 49, 0 | 22 Dec 2014
cuDNN: Efficient Primitives for Deep Learning | Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan M. Cohen, J. Tran, Bryan Catanzaro, Evan Shelhamer | - | 140, 1,850, 0 | 03 Oct 2014