1704.04760
Cited By
In-Datacenter Performance Analysis of a Tensor Processing Unit
16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"
50 / 1,167 papers shown
Title
Image Classification at Supercomputer Scale
Chris Ying
Sameer Kumar
Dehao Chen
Tao Wang
Youlong Cheng
VLM
68
123
0
16 Nov 2018
Streaming End-to-end Speech Recognition For Mobile Devices
Yanzhang He
Tara N. Sainath
Rohit Prabhavalkar
Ian McGraw
R. Álvarez
...
K. Sim
Tom Bagby
Shuo-yiin Chang
Kanishka Rao
A. Gruenstein
119
629
0
15 Nov 2018
Performance Estimation of Synthesis Flows cross Technologies using LSTMs and Transfer Learning
Cunxi Yu
Wang Zhou
AI4TS
18
0
0
14 Nov 2018
Measuring the Effects of Data Parallelism on Neural Network Training
Christopher J. Shallue
Jaehoon Lee
J. Antognini
J. Mamou
J. Ketterling
Yao Wang
115
409
0
08 Nov 2018
Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization
H. T. Kung
Bradley McDanel
Shanghang Zhang
93
135
0
07 Nov 2018
RNNFast: An Accelerator for Recurrent Neural Networks Using Domain Wall Memory
Mohammad Hossein Samavatian
Anys Bacha
Li Zhou
R. Teodorescu
56
7
0
07 Nov 2018
Simple, Distributed, and Accelerated Probabilistic Programming
Like Hui
Matthew Hoffman
Siyuan Ma
Christopher Suter
Srinivas Vasudevan
Alexey Radul
M. Belkin
Rif A. Saurous
BDL
85
56
0
05 Nov 2018
CapsAcc: An Efficient Hardware Accelerator for CapsuleNets with Data Reuse
Alberto Marchisio
Muhammad Abdullah Hanif
Mohamed Bennai
56
25
0
02 Nov 2018
Low-Precision Random Fourier Features for Memory-Constrained Kernel Approximation
Jian Zhang
Avner May
Tri Dao
Christopher Ré
80
29
0
31 Oct 2018
Democratizing Production-Scale Distributed Deep Learning
Minghuang Ma
Hadi Pouransari
Daniel Chao
Saurabh N. Adya
S. Serrano
Yi Qin
Dan Gimnicher
Dominic Walsh
MoE
103
6
0
31 Oct 2018
A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
Boyu Zhang
A. Davoodi
Y. Hu
MoE
20
2
0
31 Oct 2018
A mixed signal architecture for convolutional neural networks
M. Gorlatova
C. Pan
John McGuiness
Andras Horvath
A. Naeemi
Michael Niemier
X. S. Hu
3DV
36
24
0
30 Oct 2018
MPNA: A Massively-Parallel Neural Array Accelerator with Dataflow Optimization for Convolutional Neural Networks
Muhammad Abdullah Hanif
Rachmad Vidya Wicaksana Putra
Muhammad Tanvir
R. Hafiz
Semeen Rehman
Mohamed Bennai
41
17
0
30 Oct 2018
Whetstone: A Method for Training Deep Artificial Neural Networks for Binary Communication
William M. Severa
C. Vineyard
Ryan Dellana
Stephen J Verzi
J. Aimone
77
96
0
26 Oct 2018
Efficient learning of neighbor representations for boundary trees and forests
Tharindu B. Adikari
S. Draper
359
2
0
26 Oct 2018
Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
Jens Domke
Kazuaki Matsumura
Mohamed Wahib
Haoyu Zhang
Keita Yashima
Toshiki Tsuchikawa
Yohei Tsuji
Artur Podobas
Satoshi Matsuoka
26
17
0
22 Oct 2018
SCALE-Sim: Systolic CNN Accelerator Simulator
A. Samajdar
Yuhao Zhu
P. Whatmough
Matthew Mattina
Tushar Krishna
118
137
0
16 Oct 2018
Morph: Flexible Acceleration for 3D CNN-based Video Understanding
Kartik Hegde
R. Agrawal
Yulun Yao
Christopher W. Fletcher
72
71
0
16 Oct 2018
Embedded deep learning in ophthalmology: Making ophthalmic imaging smarter
Petteri Teikari
Raymond P. Najjar
L. Schmetterer
D. Milea
MedIm
135
27
0
13 Oct 2018
A Closer Look at Structured Pruning for Neural Network Compression
Elliot J. Crowley
Jack Turner
Amos Storkey
Michael F. P. O'Boyle
3DPC
80
31
0
10 Oct 2018
Characterizing Deep-Learning I/O Workloads in TensorFlow
Steven W. D. Chien
Stefano Markidis
C. Sishtla
Luís Santos
Pawel Herman
Sai B. Narasimhamurthy
Erwin Laure
91
50
0
06 Oct 2018
Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA
Cheng Fu
Shilin Zhu
Hao Su
Ching-En Lee
Jishen Zhao
MQ
72
32
0
04 Oct 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
41
26
0
30 Sep 2018
NICE: Noise Injection and Clamping Estimation for Neural Network Quantization
Chaim Baskin
Natan Liss
Yoav Chai
Evgenii Zheltonozhskii
Eli Schwartz
Raja Giryes
A. Mendelson
A. Bronstein
MQ
97
62
0
29 Sep 2018
Learning Recurrent Binary/Ternary Weights
A. Ardakani
Zhengyun Ji
S. C. Smithson
B. Meyer
W. Gross
MQ
67
28
0
28 Sep 2018
Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems
Graham Gobieski
Nathan Beckmann
Brandon Lucia
67
207
0
28 Sep 2018
Relay: A New IR for Machine Learning Frameworks
Jared Roesch
Steven Lyubomirsky
Logan Weber
Josh Pollock
Marisa Kirisame
Tianqi Chen
Zachary Tatlock
71
107
0
26 Sep 2018
From Audio to Semantics: Approaches to end-to-end spoken language understanding
Parisa Haghani
A. Narayanan
M. Bacchiani
Galen Chuang
Neeraj Gaur
Pedro J. Moreno
Rohit Prabhavalkar
Zhongdi Qu
Austin Waters
63
151
0
24 Sep 2018
Neural Network Decoders for Large-Distance 2D Toric Codes
Xiaotong Ni
52
45
0
18 Sep 2018
MotherNets: Rapid Deep Ensemble Learning
Abdul Wasay
Brian Hentschel
Yuze Liao
Sanyuan Chen
Stratos Idreos
58
35
0
12 Sep 2018
Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators
Xuan S. Yang
Mingyu Gao
Qiaoyi Liu
Jeff Setter
Jing Pu
...
Kaidi Cao
Heonjae Ha
Priyanka Raina
Christos Kozyrakis
M. Horowitz
193
232
0
10 Sep 2018
Optimizing CNN Model Inference on CPUs
Yizhi Liu
Yao Wang
Ruofei Yu
Mu Li
Vin Sharma
Yida Wang
52
156
0
07 Sep 2018
Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads
Wanling Gao
Jianfeng Zhan
Lei Wang
Chunjie Luo
Daoyi Zheng
...
Chen Zheng
Xu Wen
Qiang Yang
Haibin Wang
Rui Ren
36
35
0
26 Aug 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks
Sercan O. Arik
Heewoo Jun
G. Diamos
76
108
0
20 Aug 2018
Navigating the Landscape for Real-time Localisation and Mapping for Robotics and Virtual and Augmented Reality
Sajad Saeedi
Bruno Bodin
Harry Wagstaff
A. Nisbet
Luigi Nardi
...
Michael F. P. O'Boyle
Andrew J. Davison
Paul H. J. Kelly
M. Luján
Steve Furber
58
42
0
20 Aug 2018
A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)
Y. Hsu
Yu-Chen Lin
Szu-Wei Fu
Yu Tsao
Tei-Wei Kuo
MQ
48
15
0
17 Aug 2018
Toward domain-invariant speech recognition via large scale training
A. Narayanan
Ananya Misra
K. Sim
Golan Pundak
Anshuman Tripathi
Mohamed G. Elfeky
Parisa Haghani
Trevor Strohman
M. Bacchiani
VLM
55
110
0
16 Aug 2018
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya
Deborah Bard
P. Mendygral
Lawrence Meadows
James A. Arnemann
...
Nalini Kumar
S. Ho
Michael F. Ringenburg
P. Prabhat
Victor W. Lee
AI4CE
96
126
0
14 Aug 2018
StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs
Tianyun Zhang
Shaokai Ye
Kaiqi Zhang
Xiaolong Ma
Ning Liu
...
Jian Tang
Kaisheng Ma
Xue Lin
M. Fardad
Yanzhi Wang
89
51
0
29 Jul 2018
TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing
Augustus Odena
Ian Goodfellow
AAML
76
323
0
28 Jul 2018
A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition
Shubham Toshniwal
Anjuli Kannan
Chung-Cheng Chiu
Yonghui Wu
Tara N. Sainath
Karen Livescu
84
157
0
27 Jul 2018
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang
Chien-chin Huang
Jinyang Li
120
155
0
24 Jul 2018
Text Classification based on Multiple Block Convolutional Highways
S. M. Rezaeinia
A. Ghodsi
R. Rahmani
AI4TS
47
5
0
23 Jul 2018
Recent Advances in Convolutional Neural Network Acceleration
Qianru Zhang
Meng Zhang
Tinghuan Chen
Zhifei Sun
Yuzhe Ma
Bei Yu
82
351
0
23 Jul 2018
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
T. Moreau
Tianqi Chen
Luis Vega
Jared Roesch
Eddie Q. Yan
...
Josh Fromm
Ziheng Jiang
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
74
73
0
11 Jul 2018
Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices
Yu-hsin Chen
Tien-Ju Yang
J. Emer
Vivienne Sze
MQ
43
70
0
10 Jul 2018
Pooling Pyramid Network for Object Detection
Pengchong Jin
V. Rathod
Xiangxin Zhu
ObjD
72
20
0
09 Jul 2018
Progressive Spatial Recurrent Neural Network for Intra Prediction
Yueyu Hu
Wenhan Yang
Mading Li
Jiaying Liu
77
57
0
06 Jul 2018
Sparse Deep Neural Network Exact Solutions
J. Kepner
V. Gadepally
Hayden Jananthan
Lauren Milechin
S. Samsi
113
14
0
06 Jul 2018
Restructuring Batch Normalization to Accelerate CNN Training
Wonkyung Jung
Daejin Jung
Byeongho Kim
Sunjung Lee
Wonjong Rhee
Jung Ho Ahn
57
64
0
04 Jul 2018