In-Datacenter Performance Analysis of a Tensor Processing Unit

16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,165 papers shown
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
69
1,862
0
29 Mar 2022
Stack operation of tensor networks
Tianning Zhang
Tianqi Chen
Erping Li
Bo Yang
L. Ang
GNN
15
2
0
28 Mar 2022
Discovering dynamical features of Hodgkin-Huxley-type model of physiological neuron using artificial neural network
P. V. Kuptsov
N. Stankevich
Elmira Bagautdinova
16
3
0
26 Mar 2022
TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference
Luca Bompani
Alberto Dequino
Daniele Jahier Pagliari
Francesco Conti
Marcello Zanghieri
Enrico Macii
Luca Benini
Massimo Poncino
AI4TS
49
7
0
24 Mar 2022
U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search
A. C. Yüzügüler
Nikolaos Dimitriadis
P. Frossard
19
2
0
23 Mar 2022
Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges
S. Thais
P. Calafiura
G. Chachamis
G. Dezoort
Javier Mauricio Duarte
S. Ganguly
Michael Kagan
D. Murnane
Mark S. Neubauer
K. Terao
PINN
AI4CE
60
30
0
23 Mar 2022
NNReArch: A Tensor Program Scheduling Framework Against Neural Network Architecture Reverse Engineering
Yukui Luo
Shijin Duan
Gongye Cheng
Yunsi Fei
Xiaolin Xu
21
8
0
22 Mar 2022
Scale-out Systolic Arrays
A. C. Yüzügüler
Canberk Sönmez
M. Drumond
Yunho Oh
Babak Falsafi
P. Frossard
19
17
0
22 Mar 2022
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey
Giorgos Armeniakos
Georgios Zervakis
Dimitrios Soudris
J. Henkel
222
94
0
16 Mar 2022
Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation
Liu Ke
Udit Gupta
Mark Hempstead
Carole-Jean Wu
Hsien-Hsin S. Lee
Xuan Zhang
31
21
0
14 Mar 2022
Energy-Latency Attacks via Sponge Poisoning
Antonio Emanuele Cinà
Ambra Demontis
Battista Biggio
Fabio Roli
Marcello Pelillo
SILM
75
29
0
14 Mar 2022
FlexBlock: A Flexible DNN Training Accelerator with Multi-Mode Block Floating Point Support
Seock-Hwan Noh
Jahyun Koo
Seunghyun Lee
Jongse Park
Jaeha Kung
AI4CE
39
17
0
13 Mar 2022
AdaPT: Fast Emulation of Approximate DNN Accelerators in PyTorch
Dimitrios Danopoulos
Georgios Zervakis
K. Siozios
Dimitrios Soudris
J. Henkel
40
32
0
08 Mar 2022
Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance
Hiroyuki Ootomo
Rio Yokota
27
32
0
07 Mar 2022
Structured Pruning is All You Need for Pruning CNNs at Initialization
Yaohui Cai
Weizhe Hua
Hongzheng Chen
G. E. Suh
Christopher De Sa
Zhiru Zhang
CVBM
49
14
0
04 Mar 2022
Reinventing High Performance Computing: Challenges and Opportunities
Daniel Reed
Dennis Gannon
Jack J. Dongarra
AILaw
22
30
0
04 Mar 2022
Query Processing on Tensor Computation Runtimes
Dong He
Supun Nakandala
Dalitso Banda
Rathijit Sen
Karla Saur
Kwanghyun Park
Carlo Curino
Jesús Camacho-Rodríguez
Konstantinos Karanasos
Matteo Interlandi
32
36
0
03 Mar 2022
FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
Shenggan Cheng
Xuanlei Zhao
Guangyang Lu
Bin-Rui Li
Zhongming Yu
Tian Zheng
R. Wu
Xiwen Zhang
Jian Peng
Yang You
AI4CE
27
31
0
02 Mar 2022
PARIS and ELSA: An Elastic Scheduling Algorithm for Reconfigurable Multi-GPU Inference Servers
Yunseong Kim
Yujeong Choi
Minsoo Rhu
28
15
0
27 Feb 2022
Saving RNN Computations with a Neuron-Level Fuzzy Memoization Scheme
Franyell Silfa
J. Arnau
Antonio González
27
1
0
14 Feb 2022
Blocking Techniques for Sparse Matrix Multiplication on Tensor Accelerators
P. S. Labini
M. Bernaschi
Francesco Silvestri
Flavio Vella
25
3
0
11 Feb 2022
Energy awareness in low precision neural networks
Nurit Spingarn-Eliezer
Ron Banner
Elad Hoffer
Hilla Ben-Yaacov
T. Michaeli
62
0
0
06 Feb 2022
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators
Lois Orosa
Skanda Koppula
Yaman Umuroglu
Konstantinos Kanellopoulos
Juan Gómez Luna
Michaela Blott
K. Vissers
O. Mutlu
48
4
0
04 Feb 2022
Neural Network Training with Asymmetric Crosspoint Elements
M. Onen
Tayfun Gokmen
T. Todorov
T. Nowicki
Jesús A. del Alamo
J. Rozen
W. Haensch
Seyoung Kim
39
19
0
31 Jan 2022
ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise
Minjia Zhang
U. Niranjan
Yuxiong He
33
1
0
29 Jan 2022
Flashlight: Enabling Innovation in Tools for Machine Learning
Jacob Kahn
Vineel Pratap
Tatiana Likhomanenko
Qiantong Xu
Awni Y. Hannun
...
Gilad Avidov
Benoit Steiner
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
32
28
0
29 Jan 2022
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
Geet Sethi
Bilge Acun
Niket Agarwal
Christos Kozyrakis
Caroline Trippel
Carole-Jean Wu
52
67
0
25 Jan 2022
Description-Driven Task-Oriented Dialog Modeling
Jeffrey Zhao
Raghav Gupta
Yuan Cao
Dian Yu
Mingqiu Wang
Harrison Lee
Abhinav Rastogi
Izhak Shafran
Yonghui Wu
53
64
0
21 Jan 2022
APack: Off-Chip, Lossless Data Compression for Efficient Deep Learning Inference
Alberto Delmas Lascorz
Mostafa Mahmoud
Andreas Moshovos
MQ
28
1
0
21 Jan 2022
VELTAIR: Towards High-Performance Multi-tenant Deep Learning Services via Adaptive Compilation and Scheduling
Zihan Liu
Jingwen Leng
Zhihui Zhang
Quan Chen
Chao Li
Minyi Guo
29
46
0
17 Jan 2022
Distributed Evolution Strategies Using TPUs for Meta-Learning
Alex Sheng
J. He
16
2
0
01 Jan 2022
A Survey of Near-Data Processing Architectures for Neural Networks
M. Hassanpour
Marc Riera
Antonio González
12
10
0
23 Dec 2021
Proving Theorems using Incremental Learning and Hindsight Experience Replay
Eser Aygun
Laurent Orseau
Ankit Anand
Xavier Glorot
Vlad Firoiu
Lei M. Zhang
Doina Precup
Shibl Mourad
CLL
LRM
38
17
0
20 Dec 2021
Torch.fx: Practical Program Capture and Transformation for Deep Learning in Python
James K. Reed
Zach DeVito
Horace He
Ansley Ussery
Jason Ansel
CLIP
36
47
0
15 Dec 2021
Programming with Neural Surrogates of Programs
Alex Renda
Yi Ding
Michael Carbin
24
3
0
12 Dec 2021
Automap: Towards Ergonomic Automated Parallelism for ML Models
Michael Schaarschmidt
Dominik Grewe
Dimitrios Vytiniotis
Adam Paszke
G. Schmid
...
James Molloy
Jonathan Godwin
Norman A. Rink
Vinod Nair
Dan Belov
MoE
27
16
0
06 Dec 2021
TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs
Yuke Wang
Boyuan Feng
Zheng Wang
Guyue Huang
Yufei Ding
GNN
24
28
0
03 Dec 2021
Simplifying heterogeneous migration between x86 and ARM machines
Nikolaos Mavrogeorgis
11
0
0
02 Dec 2021
Memory-efficient array redistribution through portable collective communication
Norman A. Rink
Adam Paszke
Dimitrios Vytiniotis
G. Schmid
17
4
0
02 Dec 2021
RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs
Michael Niemeyer
Jonathan T. Barron
B. Mildenhall
Mehdi S. M. Sajjadi
Andreas Geiger
Noha Radwan
53
582
0
01 Dec 2021
PokeBNN: A Binary Pursuit of Lightweight Accuracy
Yichi Zhang
Zhiru Zhang
Lukasz Lew
MQ
45
57
0
30 Nov 2021
Urban Radiance Fields
Konstantinos Rematas
An Liu
Pratul P. Srinivasan
Jonathan T. Barron
Andrea Tagliasacchi
Thomas Funkhouser
V. Ferrari
21
280
0
29 Nov 2021
A Dense Tensor Accelerator with Data Exchange Mesh for DNN and Vision Workloads
Yu-Sheng Lin
Wei-Chao Chen
Shao-Yi Chien
31
1
0
25 Nov 2021
EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search
Qian Jiang
Xiaofan Zhang
Deming Chen
Minh Do
Raymond A. Yeh
30
7
0
24 Nov 2021
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
Jonathan T. Barron
B. Mildenhall
Dor Verbin
Pratul P. Srinivasan
Peter Hedman
54
1,635
0
23 Nov 2021
Combined Scaling for Zero-shot Transfer Learning
Hieu H. Pham
Zihang Dai
Golnaz Ghiasi
Kenji Kawaguchi
Hanxiao Liu
...
Yi-Ting Chen
Minh-Thang Luong
Yonghui Wu
Mingxing Tan
Quoc V. Le
VLM
17
195
0
19 Nov 2021
A Modular 1D-CNN Architecture for Real-time Digital Pre-distortion
U. D. Silva
T. Koike-Akino
R. Ma
Ao Yamashita
H. Nakamizo
MQ
15
5
0
18 Nov 2021
Attacking Deep Learning AI Hardware with Universal Adversarial Perturbation
Mehdi Sadi
B. M. S. Bahar Talukder
Kaniz Mishty
Md. Tauhidur Rahman
AAML
39
0
0
18 Nov 2021
Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics
Amirata Ghorbani
Dina Berenbaum
Maor Ivgi
Yuval Dafna
James Zou
FAtt
28
8
0
10 Nov 2021
On minimizers and convolutional filters: theoretical connections and applications to genome analysis
YunLong Yu
25
1
0
09 Nov 2021