ResearchTrend.AI
In-Datacenter Performance Analysis of a Tensor Processing Unit


16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,165 papers shown
DRACO: Co-Optimizing Hardware Utilization, and Performance of DNNs on Systolic Accelerator
N. Jha
Shreyas Ravishankar
Sparsh Mittal
Arvind Kaushik
D. Mandal
M. Chandra
26 Jun 2020
On the Difficulty of Designing Processor Arrays for Deep Neural Networks
Kevin Stehle
Günther Schindler
Holger Fröning
24 Jun 2020
Inference with Artificial Neural Networks on Analog Neuromorphic Hardware
Johannes Weis
Philipp Spilger
Sebastian Billaudelle
Yannik Stradmann
Arne Emmel
...
V. Karasenko
Mitja Kleider
Christian Mauch
Korbinian Schreiber
Johannes Schemmel
23 Jun 2020
Similarity Search with Tensor Core Units
Thomas Dybdahl Ahle
Francesco Silvestri
22 Jun 2020
Quantum Computing Methods for Supervised Learning
V. Kulkarni
Milind Kulkarni
Aniruddha Pant
22 Jun 2020
Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift
Zachary Nado
Shreyas Padhy
D. Sculley
Alexander D'Amour
Balaji Lakshminarayanan
Jasper Snoek
OOD
AI4TS
19 Jun 2020
Caffe Barista: Brewing Caffe with FPGAs in the Training Loop
D. A. Vink
A. Rajagopal
Stylianos I. Venieris
C. Bouganis
BDL
18 Jun 2020
A Review of 1D Convolutional Neural Networks toward Unknown Substance Identification in Portable Raman Spectrometer
Mohammad Mozaffari
L. Tay
18 Jun 2020
Efficient Execution of Quantized Deep Learning Models: A Compiler Approach
Animesh Jain
Shoubhik Bhattacharya
Masahiro Masuda
Vin Sharma
Yida Wang
MQ
18 Jun 2020
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
17 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
16 Jun 2020
Logically Synthesized, Hardware-Accelerated, Restricted Boltzmann Machines for Combinatorial Optimization and Integer Factorization
Saavan Patel
Philip Canoza
Sayeef Salahuddin
16 Jun 2020
Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs
A. Rajagopal
D. A. Vink
Stylianos I. Venieris
C. Bouganis
MQ
16 Jun 2020
SECure: A Social and Environmental Certificate for AI Systems
Abhishek Gupta
Camylle Lanteigne
Sara Kingsley
11 Jun 2020
STONNE: A Detailed Architectural Simulator for Flexible Neural Network Accelerators
Francisco Munoz-Martínez
José L. Abellán
M. Acacio
T. Krishna
10 Jun 2020
Making Convolutions Resilient via Algorithm-Based Error Detection Techniques
S. Hari
Michael B. Sullivan
Timothy Tsai
S. Keckler
08 Jun 2020
EDCompress: Energy-Aware Model Compression for Dataflows
Zhehui Wang
Yaoyu Zhang
Qiufeng Wang
Rick Siow Mong Goh
08 Jun 2020
Generative Design of Hardware-aware DNNs
Sheng-Chun Kao
Arun Ramamurthy
T. Krishna
MQ
06 Jun 2020
High-level Modeling of Manufacturing Faults in Deep Neural Network Accelerators
Shamik Kundu
Ahmet Soyyiğit
K. A. Hoque
K. Basu
05 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
05 Jun 2020
Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training
Hongyu Zhu
Amar Phanishayee
Gennady Pekhimenko
05 Jun 2020
Exploring the Potential of Low-bit Training of Convolutional Neural Networks
Kai Zhong
Xuefei Ning
Guohao Dai
Zhenhua Zhu
Tianchen Zhao
Shulin Zeng
Yu Wang
Huazhong Yang
MQ
04 Jun 2020
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
A. Gujarati
Reza Karimi
Safya Alzayat
Wei Hao
Antoine Kaufmann
Ymir Vigfusson
Jonathan Mace
03 Jun 2020
Light-in-the-loop: using a photonics co-processor for scalable training of neural networks
Julien Launay
Iacopo Poli
Kilian Muller
I. Carron
L. Daudet
Florent Krzakala
S. Gigan
02 Jun 2020
PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives
Sanket Tavarageri
A. Heinecke
Sasikanth Avancha
Gagandeep Goyal
Ramakrishna Upadrasta
Bharat Kaul
02 Jun 2020
Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine
Prasanth Chatarasi
S. Neuendorffer
Samuel Bayliss
K. Vissers
Vivek Sarkar
02 Jun 2020
SiEVE: Semantically Encoded Video Analytics on Edge and Cloud
Tarek Elgamal
Shu Shi
Varun Gupta
R. Jana
Klara Nahrstedt
01 Jun 2020
Artificial neural networks for neuroscientists: A primer
G. R. Yang
Xiao-Jing Wang
01 Jun 2020
Climbing down Charney's ladder: Machine Learning and the post-Dennard era of computational climate science
Venkatramani Balaji
AI4CE
24 May 2020
HyperLogLog Sketch Acceleration on FPGA
Amit Kulkarni
Monica Chiosa
Thomas B. Preußer
Kaan Kara
David Sidler
Gustavo Alonso
24 May 2020
Conditionally Deep Hybrid Neural Networks Across Edge and Cloud
Yinghan Long
I. Chakraborty
Kaushik Roy
21 May 2020
Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference
Zhi-Gang Liu
P. Whatmough
Matthew Mattina
16 May 2020
OD-SGD: One-step Delay Stochastic Gradient Descent for Distributed Training
Yemao Xu
Dezun Dong
Weixia Xu
Xiangke Liao
14 May 2020
Deep Learning: Our Miraculous Year 1990-1991
J. Schmidhuber
3DGS
MedIm
12 May 2020
Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations
Ranggi Hwang
Taehun Kim
Youngeun Kwon
Minsoo Rhu
12 May 2020
Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures
Dhiraj D. Kalamkar
E. Georganas
Sudarshan Srinivasan
Jianping Chen
Mikhail Shiryaev
A. Heinecke
10 May 2020
GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference
Ali Hadi Zadeh
Isak Edo
Omar Mohamed Awad
Andreas Moshovos
MQ
08 May 2020
One-step regression and classification with crosspoint resistive memory arrays
Zhong Sun
Giacomo Pedretti
A. Bricalli
Daniele Ielmini
05 May 2020
An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration
Behzad Salami
Erhan Baturay Onural
Ismail Emir Yüksel
Fahrettin Koc
Oguz Ergin
A. Cristal
O. Unsal
H. Sarbazi-Azad
O. Mutlu
04 May 2020
Spiking Neural Networks Hardware Implementations and Challenges: a Survey
Maxence Bouvier
A. Valentian
T. Mesquida
F. Rummens
M. Reyboz
Elisa Vianello
E. Beigné
04 May 2020
TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain
Weitao Li
Pengfei Xu
Yang Katie Zhao
Haitong Li
Yuan Xie
Yingyan Lin
03 May 2020
Lupulus: A Flexible Hardware Accelerator for Neural Networks
Andreas Toftegaard Kristensen
R. Giterman
Alexios Balatsoukas-Stimming
A. Burg
03 May 2020
AIBench Training: Balanced Industry-Standard AI Training Benchmarking
Fei Tang
Wanling Gao
Jianfeng Zhan
Chuanxin Lan
Xu Wen
...
Yatao Li
Junchao Shao
Zhenyu Wang
Xiaoyu Wang
Hainan Ye
30 Apr 2020
Synergistic CPU-FPGA Acceleration of Sparse Linear Algebra
M. Soltaniyeh
R. Martin
Santosh Nagarakatte
29 Apr 2020
FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training
Sangkug Lym
M. Erez
27 Apr 2020
Memory-efficient training with streaming dimensionality reduction
Siyuan Huang
Brian D. Hoskins
M. Daniels
M. D. Stiles
G. Adam
25 Apr 2020
PERMDNN: Efficient Compressed DNN Architecture with Permuted Diagonal Matrices
Chunhua Deng
Siyu Liao
Yi Xie
Keshab K. Parhi
Xuehai Qian
Bo Yuan
23 Apr 2020
DRMap: A Generic DRAM Data Mapping Policy for Energy-Efficient Processing of Convolutional Neural Networks
Rachmad Vidya Wicaksana Putra
Muhammad Abdullah Hanif
Mohamed Bennai
21 Apr 2020
MGX: Near-Zero Overhead Memory Protection for Data-Intensive Accelerators
Weizhe Hua
M. Umar
Zhiru Zhang
G. E. Suh
GNN
20 Apr 2020
Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation
Hao Wu
Patrick Judd
Xiaojie Zhang
Mikhail Isaev
Paulius Micikevicius
MQ
20 Apr 2020