1704.04760
In-Datacenter Performance Analysis of a Tensor Processing Unit
16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
Papers citing
"In-Datacenter Performance Analysis of a Tensor Processing Unit"
50 / 1,167 papers shown
Title
AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference
Thierry Tambe
En-Yu Yang
Zishen Wan
Yuntian Deng
Vijay Janapa Reddi
Alexander M. Rush
David Brooks
Gu-Yeon Wei
MQ
58
21
0
29 Sep 2019
Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator
Tian Zhao
Yaqi Zhang
K. Olukotun
63
16
0
26 Sep 2019
CAT: Compression-Aware Training for bandwidth reduction
Chaim Baskin
Brian Chmiel
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
67
12
0
25 Sep 2019
Accelerating convolutional neural network by exploiting sparsity on GPUs
Weizhi Xu
Yintai Sun
Shengyu Fan
Hui Yu
Xin Fu
99
7
0
22 Sep 2019
Scale MLPerf-0.6 models on Google TPU-v3 Pods
Sameer Kumar
Victor Bitorff
Dehao Chen
Chi-Heng Chou
Blake A. Hechtman
...
Peter Mattson
Shibo Wang
Tao Wang
Yuanzhong Xu
Zongwei Zhou
87
39
0
21 Sep 2019
SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems
Xiaofan Zhang
Haoming Lu
Cong Hao
Jiachen Li
Bowen Cheng
...
Jinjun Xiong
Thomas Huang
Humphrey Shi
Wen-mei W. Hwu
Deming Chen
106
92
0
20 Sep 2019
A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks
Xiaoyu Yu
Yuwei Wang
Jie Miao
Ephrem Wu
Heng Zhang
Yu Meng
Bo Zhang
Biao Min
Dewei Chen
Jianlin Gao
54
21
0
17 Sep 2019
High-Throughput In-Memory Computing for Binary Deep Neural Networks with Monolithically Integrated RRAM and 90nm CMOS
Shihui Yin
Xiaoyu Sun
Shimeng Yu
Jae-sun Seo
MQ
31
106
0
16 Sep 2019
Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training
Yuxin Wang
Qiang-qiang Wang
Shaoshuai Shi
Xin He
Zhenheng Tang
Kaiyong Zhao
Xiaowen Chu
119
3
0
15 Sep 2019
Heterogeneous Dataflow Accelerators for Multi-DNN Workloads
Hyoukjun Kwon
Liangzhen Lai
Michael Pellauer
T. Krishna
Yu-Hsin Chen
Vikas Chandra
84
18
0
13 Sep 2019
DASNet: Dynamic Activation Sparsity for Neural Network Efficiency Improvement
Qing Yang
Jiachen Mao
Zuoguan Wang
H. Li
65
15
0
13 Sep 2019
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model
Anjuli Kannan
A. Datta
Tara N. Sainath
Eugene Weinstein
Bhuvana Ramabhadran
Yonghui Wu
Ankur Bapna
Zhiwen Chen
Seungjin Lee
AuLLM
71
174
0
11 Sep 2019
Unrolling Ternary Neural Networks
Stephen Tridgell
M. Kumm
M. Hardieck
David Boland
Duncan J. M. Moss
P. Zipf
Philip H. W. Leong
63
28
0
09 Sep 2019
PREMA: A Predictive Multi-task Scheduling Algorithm For Preemptible Neural Processing Units
Yujeong Choi
Minsoo Rhu
61
135
0
06 Sep 2019
ModiPick: SLA-aware Accuracy Optimization For Mobile Deep Inference
Samuel S. Ogden
Tian Guo
24
3
0
04 Sep 2019
Sparse Deep Neural Network Graph Challenge
J. Kepner
Simon Alford
V. Gadepally
Michael Jones
Lauren Milechin
Ryan A. Robinett
S. Samsi
GNN
57
49
0
02 Sep 2019
Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset
Bill Byrne
Karthikeyan K
Chinnadhurai Sankar
Arvind Neelakantan
Daniel Duckworth
Semih Yavuz
Ben Goodrich
Amit Dubey
A. Cedilnik
Kyu-Young Kim
67
219
0
01 Sep 2019
TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
S. Samsi
Michael Houle
24
4
0
29 Aug 2019
High Performance Scalable FPGA Accelerator for Deep Neural Networks
Sudarshan Srinivasan
Pradeep Janedula
Saurabh Dhoble
Sasikanth Avancha
Dipankar Das
Naveen Mellempudi
Bharat Daga
M. Langhammer
Gregg Baeckler
Bharat Kaul
18
3
0
29 Aug 2019
Extending TensorFlow's Semantics with Pipelined Execution
S. Whitlock
James R. Larus
Edouard Bugnion
19
1
0
25 Aug 2019
A Computational Model for Tensor Core Units
Rezaul Chowdhury
Francesco Silvestri
Flavio Vella
29
15
0
19 Aug 2019
Automatic Compiler Based FPGA Accelerator for CNN Training
S. Venkataramanaiah
Yufei Ma
Shihui Yin
Eriko Nurvitadhi
A. Dasu
Yu Cao
Jae-sun Seo
62
37
0
15 Aug 2019
Accelerated CNN Training Through Gradient Approximation
Ziheng Wang
Sree Harsha Nelaturu
366
5
0
15 Aug 2019
AIBench: An Industry Standard Internet Service AI Benchmark Suite
Wanling Gao
Fei Tang
Lei Wang
Jianfeng Zhan
Chunxin Lan
...
Yatao Li
Junchao Shao
Zhenyu Wang
Xiaoyu Wang
Hainan Ye
68
45
0
13 Aug 2019
TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
86
215
0
08 Aug 2019
3D-aCortex: An Ultra-Compact Energy-Efficient Neurocomputing Platform Based on Commercial 3D-NAND Flash Memories
Mohammad Bavandpour
Shubham Sahay
M. Mahmoodi
D. Strukov
48
30
0
07 Aug 2019
Tuning Algorithms and Generators for Efficient Edge Inference
R. Naous
Lazar Supic
Yoonhwan Kang
Ranko Seradejovic
Anish Singhani
Vladimir M. Stojanović
16
2
0
31 Jul 2019
HPC AI500: A Benchmark Suite for HPC AI Systems
Zihan Jiang
Wanling Gao
Lei Wang
Xingwang Xiong
Yuchen Zhang
...
Yunquan Zhang
Shengzhong Feng
KenLi Li
Weijia Xu
Jianfeng Zhan
ELM
68
40
0
27 Jul 2019
Benchmarking TPU, GPU, and CPU Platforms for Deep Learning
Y. Wang
Gu-Yeon Wei
David Brooks
ELM
VLM
97
278
0
24 Jul 2019
A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection
Yue Liu
Zeyi Wen
Zhaomin Wu
Sixu Hu
Naibo Wang
Yuan N. Li
Xu Liu
Bingsheng He
FedML
130
1,014
0
23 Jul 2019
Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference
Weiwen Jiang
E. Sha
Xinyi Zhang
Lei Yang
Qingfeng Zhuge
Yiyu Shi
Jiaxi Hu
95
75
0
21 Jul 2019
Convergence of Edge Computing and Deep Learning: A Comprehensive Survey
Xiaofei Wang
Yiwen Han
Victor C. M. Leung
Dusit Niyato
Xueqiang Yan
Xu Chen
102
1,002
0
19 Jul 2019
A Versatile Software Systolic Execution Model for GPU Memory-Bound Kernels
Peng Chen
Mohamed Wahib
Shin'ichiro Takizawa
Ryousei Takano
Satoshi Matsuoka
37
22
0
14 Jul 2019
A semi-holographic hyperdimensional representation system for hardware-friendly cognitive computing
Alexandrou Serb
I. Kobyzev
Jiaqi Wang
T. Prodromakis
41
3
0
12 Jul 2019
VarGNet: Variable Group Convolutional Neural Network for Efficient Embedded Computing
Qian Zhang
Jianjun Li
Meng Yao
Liangchen Song
Helong Zhou
Zhichao Li
Wenming Meng
Xuezhi Zhang
Guoli Wang
83
22
0
12 Jul 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Zhiwen Chen
Yonghui Wu
105
429
0
11 Jul 2019
Making AI Forget You: Data Deletion in Machine Learning
Antonio A. Ginart
M. Guan
Gregory Valiant
James Zou
MU
112
481
0
11 Jul 2019
Joint Speech Recognition and Speaker Diarization via Sequence Transduction
Laurent El Shafey
H. Soltau
Izhak Shafran
81
104
0
09 Jul 2019
Point-Voxel CNN for Efficient 3D Deep Learning
Zhijian Liu
Haotian Tang
Chengyue Wu
Song Han
3DPC
167
677
0
08 Jul 2019
Speech bandwidth extension with WaveNet
Archit Gupta
Brendan Shillingford
Yannis Assael
Thomas C. Walters
60
29
0
05 Jul 2019
Single-Path Mobile AutoML: Efficient ConvNet Design and NAS Hyperparameter Optimization
Dimitrios Stamoulis
Ruizhou Ding
Di Wang
Dimitrios Lymberopoulos
B. Priyantha
Jie Liu
Diana Marculescu
59
34
0
01 Jul 2019
On improving deep learning generalization with adaptive sparse connectivity
Shiwei Liu
Decebal Constantin Mocanu
Mykola Pechenizkiy
ODL
39
8
0
27 Jun 2019
Learning Data Augmentation Strategies for Object Detection
Barret Zoph
E. D. Cubuk
Golnaz Ghiasi
Nayeon Lee
Jonathon Shlens
Quoc V. Le
108
534
0
26 Jun 2019
ALTIS: Modernizing GPGPU Benchmarking
Bodun Hu
Christopher J. Rossbach
19
3
0
25 Jun 2019
The Coming Age of Pervasive Data Processing
Jan S. Rellermeyer
Sobhan Omranian Khorasani
D. Graur
Apourva Parthasarathy
150
5
0
21 Jun 2019
Joint Regularization on Activations and Weights for Efficient Neural Network Pruning
Q. Yang
W. Wen
Zuoguan Wang
H. Li
34
1
0
19 Jun 2019
High-Performance Deep Learning via a Single Building Block
E. Georganas
K. Banerjee
Dhiraj D. Kalamkar
Sasikanth Avancha
Anand Venkat
Michael J. Anderson
G. Henry
Hans Pabst
A. Heinecke
43
12
0
15 Jun 2019
Stand-Alone Self-Attention in Vision Models
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM
SLR
ViT
158
1,217
0
13 Jun 2019
Parameterized Structured Pruning for Deep Neural Networks
Günther Schindler
Wolfgang Roth
Franz Pernkopf
Holger Froening
51
6
0
12 Jun 2019
PABO: Pseudo Agent-Based Multi-Objective Bayesian Hyperparameter Optimization for Efficient Neural Accelerator Design
Maryam Parsa
Aayush Ankit
A. Ziabari
Kaushik Roy
79
28
0
11 Jun 2019