ResearchTrend.AI
NVIDIA Tensor Core Programmability, Performance & Precision

11 March 2018
Stefano Markidis
Steven W. D. Chien
Erwin Laure
Ivy Bo Peng
Jeffrey S. Vetter

Papers citing "NVIDIA Tensor Core Programmability, Performance & Precision"

Hexcute: A Tile-based Programming Language with Automatic Layout and Task-Mapping Synthesis
Xinsong Zhang
Yaoyao Ding
Yang Hu
Gennady Pekhimenko
113
0
0
22 Apr 2025
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization
Zhenyu Liang
Hao Li
Naiwei Yu
Kebin Sun
Ran Cheng
117
1
0
26 Mar 2025
Semiring Activation in Neural Networks
B. Smets
Peter D. Donker
Jim W. Portegies
LLMSV
66
0
0
29 May 2024
Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
Urs Koster
T. Webb
Xin Eric Wang
Marcel Nassar
Arjun K. Bansal
...
Luke Hornof
A. Khosrowshahi
Carey Kloss
Ruby J. Pai
N. Rao
MQ
47
262
0
06 Nov 2017
Mixed Precision Training
Paulius Micikevicius
Sharan Narang
Jonah Alben
G. Diamos
Erich Elsen
...
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
Hao Wu
176
1,805
0
10 Oct 2017
On the Strong Scaling of the Spectral Element Solver Nek5000 on Petascale Systems
N. Offermans
O. Marin
Michel Schanen
Jing Gong
Paul F. Fischer
P. Schlatter
Aleks Obabko
Adam Peplinski
Maxwell Hutchinson
Elia Merzari
LRM
22
80
0
09 Jun 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
237
4,644
0
16 Apr 2017
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhifeng Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zheng
GNN, AI4CE
433
18,361
0
27 May 2016
Deep Learning with Limited Numerical Precision
Suyog Gupta
A. Agrawal
K. Gopalakrishnan
P. Narayanan
HAI
207
2,049
0
09 Feb 2015
Training deep neural networks with low precision multiplications
Matthieu Courbariaux
Yoshua Bengio
J. David
MQ
81
49
0
22 Dec 2014
cuDNN: Efficient Primitives for Deep Learning
Sharan Chetlur
Cliff Woolley
Philippe Vandermersch
Jonathan M. Cohen
J. Tran
Bryan Catanzaro
Evan Shelhamer
140
1,850
0
03 Oct 2014