In-Datacenter Performance Analysis of a Tensor Processing Unit

16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
arXiv: 1704.04760

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,165 papers shown
Title
Measuring the Effects of Data Parallelism on Neural Network Training
Christopher J. Shallue
Jaehoon Lee
J. Antognini
J. Mamou
J. Ketterling
Yao Wang
49
408
0
08 Nov 2018
Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization
H. T. Kung
Bradley McDanel
Shanghang Zhang
35
133
0
07 Nov 2018
RNNFast: An Accelerator for Recurrent Neural Networks Using Domain Wall Memory
Mohammad Hossein Samavatian
Anys Bacha
Li Zhou
R. Teodorescu
32
7
0
07 Nov 2018
Simple, Distributed, and Accelerated Probabilistic Programming
Like Hui
Matthew Hoffman
Siyuan Ma
Christopher Suter
Srinivas Vasudevan
Alexey Radul
M. Belkin
Rif A. Saurous
BDL
30
56
0
05 Nov 2018
CapsAcc: An Efficient Hardware Accelerator for CapsuleNets with Data Reuse
Alberto Marchisio
Muhammad Abdullah Hanif
Mohamed Bennai
14
25
0
02 Nov 2018
Low-Precision Random Fourier Features for Memory-Constrained Kernel Approximation
Jian Zhang
Avner May
Tri Dao
Christopher Ré
13
29
0
31 Oct 2018
Democratizing Production-Scale Distributed Deep Learning
Minghuang Ma
Hadi Pouransari
Daniel Chao
Saurabh N. Adya
S. Serrano
Yi Qin
Dan Gimnicher
Dominic Walsh
MoE
36
6
0
31 Oct 2018
A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
Boyu Zhang
A. Davoodi
Y. Hu
MoE
6
2
0
31 Oct 2018
A mixed signal architecture for convolutional neural networks
M. Gorlatova
C. Pan
John McGuiness
Andras Horvath
A. Naeemi
Michael Niemier
X. S. Hu
3DV
18
24
0
30 Oct 2018
MPNA: A Massively-Parallel Neural Array Accelerator with Dataflow Optimization for Convolutional Neural Networks
Muhammad Abdullah Hanif
Rachmad Vidya Wicaksana Putra
Muhammad Tanvir
R. Hafiz
Semeen Rehman
Mohamed Bennai
17
17
0
30 Oct 2018
Whetstone: A Method for Training Deep Artificial Neural Networks for Binary Communication
William M. Severa
C. Vineyard
Ryan Dellana
Stephen J Verzi
J. Aimone
21
95
0
26 Oct 2018
Efficient learning of neighbor representations for boundary trees and forests
Tharindu B. Adikari
S. Draper
19
2
0
26 Oct 2018
Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
Jens Domke
Kazuaki Matsumura
Mohamed Wahib
Haoyu Zhang
Keita Yashima
Toshiki Tsuchikawa
Yohei Tsuji
Artur Podobas
Satoshi Matsuoka
14
17
0
22 Oct 2018
SCALE-Sim: Systolic CNN Accelerator Simulator
A. Samajdar
Yuhao Zhu
P. Whatmough
Matthew Mattina
Tushar Krishna
30
137
0
16 Oct 2018
Morph: Flexible Acceleration for 3D CNN-based Video Understanding
Kartik Hegde
R. Agrawal
Yulun Yao
Christopher W. Fletcher
33
71
0
16 Oct 2018
Embedded deep learning in ophthalmology: Making ophthalmic imaging smarter
Petteri Teikari
Raymond P. Najjar
L. Schmetterer
D. Milea
MedIm
24
27
0
13 Oct 2018
A Closer Look at Structured Pruning for Neural Network Compression
Elliot J. Crowley
Jack Turner
Amos Storkey
Michael F. P. O'Boyle
3DPC
29
31
0
10 Oct 2018
Characterizing Deep-Learning I/O Workloads in TensorFlow
Steven W. D. Chien
Stefano Markidis
C. Sishtla
Luís Santos
Pawel Herman
Sai B. Narasimhamurthy
Erwin Laure
21
50
0
06 Oct 2018
Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA
Cheng Fu
Shilin Zhu
Hao Su
Ching-En Lee
Jishen Zhao
MQ
25
31
0
04 Oct 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
14
25
0
30 Sep 2018
NICE: Noise Injection and Clamping Estimation for Neural Network Quantization
Chaim Baskin
Natan Liss
Yoav Chai
Evgenii Zheltonozhskii
Eli Schwartz
Raja Giryes
A. Mendelson
A. Bronstein
MQ
17
60
0
29 Sep 2018
Learning Recurrent Binary/Ternary Weights
A. Ardakani
Zhengyun Ji
S. C. Smithson
B. Meyer
W. Gross
MQ
14
27
0
28 Sep 2018
Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems
Graham Gobieski
Nathan Beckmann
Brandon Lucia
14
203
0
28 Sep 2018
Relay: A New IR for Machine Learning Frameworks
Jared Roesch
Steven Lyubomirsky
Logan Weber
Josh Pollock
Marisa Kirisame
Tianqi Chen
Zachary Tatlock
23
104
0
26 Sep 2018
From Audio to Semantics: Approaches to end-to-end spoken language understanding
Parisa Haghani
A. Narayanan
M. Bacchiani
Galen Chuang
Neeraj Gaur
Pedro J. Moreno
Rohit Prabhavalkar
Zhongdi Qu
Austin Waters
18
150
0
24 Sep 2018
Neural Network Decoders for Large-Distance 2D Toric Codes
Xiaotong Ni
13
43
0
18 Sep 2018
MotherNets: Rapid Deep Ensemble Learning
Abdul Wasay
Brian Hentschel
Yuze Liao
Sanyuan Chen
Stratos Idreos
14
35
0
12 Sep 2018
Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators
Xuan S. Yang
Mingyu Gao
Qiaoyi Liu
Jeff Setter
Jing Pu
...
Kaidi Cao
Heonjae Ha
Priyanka Raina
Christos Kozyrakis
M. Horowitz
29
226
0
10 Sep 2018
Optimizing CNN Model Inference on CPUs
Yizhi Liu
Yao Wang
Ruofei Yu
Mu Li
Vin Sharma
Yida Wang
12
152
0
07 Sep 2018
Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads
Wanling Gao
Jianfeng Zhan
Lei Wang
Chunjie Luo
Daoyi Zheng
...
Chen Zheng
Xu Wen
Qiang Yang
Haibin Wang
Rui Ren
19
35
0
26 Aug 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks
Sercan Ö. Arik
Heewoo Jun
G. Diamos
14
107
0
20 Aug 2018
Navigating the Landscape for Real-time Localisation and Mapping for Robotics and Virtual and Augmented Reality
Sajad Saeedi
Bruno Bodin
Harry Wagstaff
A. Nisbet
Luigi Nardi
...
Michael F. P. O'Boyle
Andrew J. Davison
Paul H. J. Kelly
M. Luján
Steve Furber
20
40
0
20 Aug 2018
A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)
Y. Hsu
Yu-Chen Lin
Szu-Wei Fu
Yu Tsao
Tei-Wei Kuo
MQ
22
15
0
17 Aug 2018
Toward domain-invariant speech recognition via large scale training
A. Narayanan
Ananya Misra
K. Sim
Golan Pundak
Anshuman Tripathi
Mohamed G. Elfeky
Parisa Haghani
Trevor Strohman
M. Bacchiani
VLM
23
107
0
16 Aug 2018
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya
Deborah Bard
P. Mendygral
Lawrence Meadows
James A. Arnemann
...
Nalini Kumar
S. Ho
Michael F. Ringenburg
P. Prabhat
Victor W. Lee
AI4CE
12
125
0
14 Aug 2018
StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs
Tianyun Zhang
Shaokai Ye
Kaiqi Zhang
Xiaolong Ma
Ning Liu
...
Jian Tang
Kaisheng Ma
Xue Lin
M. Fardad
Yanzhi Wang
25
50
0
29 Jul 2018
TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing
Augustus Odena
Ian Goodfellow
AAML
23
320
0
28 Jul 2018
A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition
Shubham Toshniwal
Anjuli Kannan
Chung-Cheng Chiu
Yonghui Wu
Tara N. Sainath
Karen Livescu
27
157
0
27 Jul 2018
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang
Chien-chin Huang
Jinyang Li
49
154
0
24 Jul 2018
Text Classification based on Multiple Block Convolutional Highways
S. M. Rezaeinia
A. Ghodsi
R. Rahmani
AI4TS
17
5
0
23 Jul 2018
Recent Advances in Convolutional Neural Network Acceleration
Qianru Zhang
Meng Zhang
Tinghuan Chen
Zhifei Sun
Yuzhe Ma
Bei Yu
36
348
0
23 Jul 2018
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
T. Moreau
Tianqi Chen
Luis Vega
Jared Roesch
Eddie Q. Yan
...
Josh Fromm
Ziheng Jiang
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
29
70
0
11 Jul 2018
Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices
Yu-hsin Chen
Tien-Ju Yang
J. Emer
Vivienne Sze
MQ
18
70
0
10 Jul 2018
Pooling Pyramid Network for Object Detection
Pengchong Jin
V. Rathod
Xiangxin Zhu
ObjD
19
20
0
09 Jul 2018
Progressive Spatial Recurrent Neural Network for Intra Prediction
Yueyu Hu
Wenhan Yang
Mading Li
Jiaying Liu
27
57
0
06 Jul 2018
Sparse Deep Neural Network Exact Solutions
J. Kepner
V. Gadepally
Hayden Jananthan
Lauren Milechin
S. Samsi
22
14
0
06 Jul 2018
Restructuring Batch Normalization to Accelerate CNN Training
Wonkyung Jung
Daejin Jung
Byeongho Kim
Sunjung Lee
Wonjong Rhee
Jung Ho Ahn
24
62
0
04 Jul 2018
A Survey on Agent-based Simulation using Hardware Accelerators
Jiajian Xiao
Philipp Andelfinger
D. Eckhoff
Wentong Cai
Alois Knoll
6
44
0
03 Jul 2018
Classifying neuromorphic data using a deep learning framework for image classification
Roshan Gopalakrishnan
Yansong Chua
Laxmi R. Iyer
8
7
0
02 Jul 2018
FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design
J. Zhang
S. Garg
24
21
0
02 Jul 2018