In-Datacenter Performance Analysis of a Tensor Processing Unit

16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,165 papers shown
Language-agnostic Multilingual Modeling
A. Datta
Bhuvana Ramabhadran
Jesse Emond
Anjuli Kannan
Brian Roark
24
35
0
20 Apr 2020
Local Search is a Remarkably Strong Baseline for Neural Architecture Search
T. D. Ottelander
A. Dushatskiy
M. Virgolin
Peter A. N. Bosman
OOD
33
38
0
20 Apr 2020
HCM: Hardware-Aware Complexity Metric for Neural Network Architectures
Alex Karbachevsky
Chaim Baskin
Evgenii Zheltonozhskii
Yevgeny Yermolin
F. Gabbay
A. Bronstein
A. Mendelson
40
11
0
19 Apr 2020
Non-Blocking Simultaneous Multithreading: Embracing the Resiliency of Deep Neural Networks
Gil Shomron
U. Weiser
16
14
0
17 Apr 2020
Bit-Parallel Vector Composability for Neural Acceleration
Soroush Ghodrati
Hardik Sharma
C. Young
Nam Sung Kim
H. Esmaeilzadeh
MQ
6
20
0
11 Apr 2020
Reducing Data Motion to Accelerate the Training of Deep Neural Networks
Sicong Zhuang
Cristiano Malossi
Marc Casas
27
0
0
05 Apr 2020
High Bandwidth Memory on FPGAs: A Data Analytics Perspective
Kaan Kara
C. Hagleitner
D. Diamantopoulos
D. Syrivelis
Gustavo Alonso
24
31
0
02 Apr 2020
Improving 3D Object Detection through Progressive Population Based Augmentation
Shuyang Cheng
Zhaoqi Leng
E. D. Cubuk
Barret Zoph
Chunyan Bai
...
Vijay Vasudevan
Congcong Li
Quoc V. Le
Jonathon Shlens
Dragomir Anguelov
3DPC
20
74
0
02 Apr 2020
Overview of the IBM Neural Computer Architecture
P. Narayanan
C. E. Cox
Alexis Asseman
Nicolas Antoine
Harald Huels
W. Wilcke
A. Ozcan
AI4CE
9
3
0
25 Mar 2020
GraphChallenge.org Sparse Deep Neural Network Performance
J. Kepner
Simon Alford
V. Gadepally
Michael Jones
Lauren Milechin
Albert Reuther
Ryan A. Robinett
S. Samsi
GNN
6
11
0
25 Mar 2020
MetNet: A Neural Weather Model for Precipitation Forecasting
C. Sønderby
L. Espeholt
Jonathan Heek
Mostafa Dehghani
Avital Oliver
Tim Salimans
Shreya Agrawal
Jason Hickey
Nal Kalchbrenner
AI4Cl
237
273
0
24 Mar 2020
BusTime: Which is the Right Prediction Model for My Bus Arrival Time?
Dairui Liu
Jingxiang Sun
Shen Wang
11
9
0
20 Mar 2020
Machine Learning enabled Spectrum Sharing in Dense LTE-U/Wi-Fi Coexistence Scenarios
Adam Dziedzic
V. Sathya
M. I. Rochman
M. Ghosh
S. Krishnan
24
19
0
18 Mar 2020
Developing a Recommendation Benchmark for MLPerf Training and Inference
Carole-Jean Wu
Robin Burke
Ed H. Chi
Joseph Konstan
Julian McAuley
Yves Raimond
Hao Zhang
VLM
16
29
0
16 Mar 2020
LCP: A Low-Communication Parallelization Method for Fast Neural Network Inference in Image Recognition
Ramyad Hadidi
Bahar Asgari
Jiashen Cao
Younmin Bae
Da Eun Shim
Hyojong Kim
Sung-Kyu Lim
Michael S. Ryoo
Hyesoon Kim
13
1
0
13 Mar 2020
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Zhenheng Tang
Shaoshuai Shi
Wei Wang
Bo Li
Xuming Hu
31
48
0
10 Mar 2020
Compiling Neural Networks for a Computational Memory Accelerator
K. Kourtis
M. Dazzi
Nikolas Ioannou
Tobias Grosser
Abu Sebastian
E. Eleftheriou
17
5
0
05 Mar 2020
Adaptive Verifiability-Driven Strategy for Evolutionary Approximation of Arithmetic Circuits
Milan Ceska
Jiří Matyáš
Vojtěch Mrázek
Lukáš Sekanina
Z. Vašíček
Tomáš Vojnar
24
4
0
05 Mar 2020
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices
Byung Hoon Ahn
Jinwon Lee
J. Lin
Hsin-Pai Cheng
Jilei Hou
H. Esmaeilzadeh
76
55
0
04 Mar 2020
RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference
Oindrila Saha
Aditya Kusupati
H. Simhadri
Manik Varma
Prateek Jain
27
54
0
27 Feb 2020
Disentangling Adaptive Gradient Methods from Learning Rates
Naman Agarwal
Rohan Anil
Elad Hazan
Tomer Koren
Cyril Zhang
27
34
0
26 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
22
148
0
26 Feb 2020
TxSim: Modeling Training of Deep Neural Networks on Resistive Crossbar Systems
Sourjya Roy
S. Sridharan
Shubham Jain
A. Raghunathan
29
44
0
25 Feb 2020
A$^3$: Accelerating Attention Mechanisms in Neural Networks with Approximation
Tae Jun Ham
Sungjun Jung
Seonghak Kim
Young H. Oh
Yeonhong Park
...
Jung-Hun Park
Sanghee Lee
Kyoung Park
Jae W. Lee
D. Jeong
24
214
0
22 Feb 2020
TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU
Filip Vaverka
Vojtěch Mrázek
Z. Vašíček
Lukáš Sekanina
9
34
0
21 Feb 2020
Scalable Second Order Optimization for Deep Learning
Rohan Anil
Vineet Gupta
Tomer Koren
Kevin Regan
Y. Singer
ODL
27
29
0
20 Feb 2020
Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration
Cong Guo
Yangjie Zhou
Jingwen Leng
Yuhao Zhu
Zidong Du
Quan Chen
Chao Li
Bin Yao
Minyi Guo
14
32
0
18 Feb 2020
Marvel: A Data-centric Compiler for DNN Operators on Spatial Accelerators
Prasanth Chatarasi
Hyoukjun Kwon
Natesh Raina
Saurabh Malik
Vaisakh Haridas
Angshuman Parashar
Michael Pellauer
T. Krishna
Vivek Sarkar
8
5
0
18 Feb 2020
Controlling Computation versus Quality for Neural Sequence Models
Ankur Bapna
N. Arivazhagan
Orhan Firat
35
30
0
17 Feb 2020
AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite
Wanling Gao
Fei Tang
Jianfeng Zhan
Chuanxin Lan
Chunjie Luo
...
Gang Lu
Junchao Shao
Zhenyu Wang
Xiaoyu Wang
Hainan Ye
27
1
0
17 Feb 2020
Improving Efficiency in Neural Network Accelerator Using Operands Hamming Distance optimization
Meng Li
Yilei Li
P. Chuang
Liangzhen Lai
Vikas Chandra
16
3
0
13 Feb 2020
Taurus: A Data Plane Architecture for Per-Packet ML
Tushar Swamy
Alexander Rucker
M. Shahbaz
Ishan Gaur
K. Olukotun
23
82
0
12 Feb 2020
Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator
Mohamed S. Abdelfattah
Łukasz Dudziak
Thomas C. P. Chau
Royson Lee
Hyeji Kim
Nicholas D. Lane
17
80
0
11 Feb 2020
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Victor Campos
Alexander R. Trott
Caiming Xiong
R. Socher
Xavier Giró-i-Nieto
Jordi Torres
OffRL
19
150
0
10 Feb 2020
A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions
Aaron R. Voelker
Daniel Rasmussen
C. Eliasmith
26
17
0
10 Feb 2020
Large-Scale Discrete Fourier Transform on TPUs
Tianjian Lu
Yi-Fan Chen
Blake A. Hechtman
Tao Wang
John R. Anderson
18
33
0
09 Feb 2020
BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization
Miloš Nikolić
G. B. Hacene
Ciaran Bannon
Alberto Delmas Lascorz
Matthieu Courbariaux
Yoshua Bengio
Vincent Gripon
Andreas Moshovos
MQ
30
24
0
08 Feb 2020
Sensitivity Analysis in the Dupire Local Volatility Model with Tensorflow
Francois Belletti
Davis E. King
James Lottes
Yi-Fan Chen
John R. Anderson
18
1
0
06 Feb 2020
PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives
Sanket Tavarageri
A. Heinecke
Sasikanth Avancha
Gagandeep Goyal
Ramakrishna Upadrasta
Bharat Kaul
22
0
0
06 Feb 2020
Photonic tensor cores for machine learning
M. Miscuglio
V. Sorger
27
147
0
01 Feb 2020
Machine learning on DNA-encoded libraries: A new paradigm for hit-finding
Kevin McCloskey
E. Sigel
S. Kearnes
L. Xue
Xia Tian
...
C. Hupp
Anthony D. Keefe
Christopher J. Mulhern
Ying Zhang
Patrick F. Riley
56
103
0
31 Jan 2020
GPU Tensor Cores for fast Arithmetic Reductions
C. Navarro
R. Carrasco
R. Barrientos
J. A. Riquelme
R. Vega
12
35
0
15 Jan 2020
The Two-Pass Softmax Algorithm
Marat Dukhan
Artsiom Ablavatski
TPM
11
8
0
13 Jan 2020
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference
Udit Gupta
Samuel Hsia
V. Saraph
Xiaodong Wang
Brandon Reagen
Gu-Yeon Wei
Hsien-Hsin S. Lee
David Brooks
Carole-Jean Wu
GNN
38
188
0
08 Jan 2020
A Supervised Learning Algorithm for Multilayer Spiking Neural Networks Based on Temporal Coding Toward Energy-Efficient VLSI Processor Design
Yusuke Sakemi
K. Morino
Takashi Morie
Kazuyuki Aihara
16
32
0
08 Jan 2020
Sparse Weight Activation Training
Md Aamir Raihan
Tor M. Aamodt
34
73
0
07 Jan 2020
HyGCN: A GCN Accelerator with Hybrid Architecture
Yurui Lai
Lei Deng
Xing Hu
Ling Liang
Yujing Feng
Xiaochun Ye
Zhimin Zhang
Dongrui Fan
Yuan Xie
GNN
33
289
0
07 Jan 2020
AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs
Pengfei Xu
Xiaofan Zhang
Cong Hao
Yang Katie Zhao
Yongan Zhang
Yue Wang
Chaojian Li
Zetong Guan
Deming Chen
Yingyan Lin
25
89
0
06 Jan 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
37
81
0
02 Jan 2020
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke
Udit Gupta
Carole-Jean Wu
B. Cho
Mark Hempstead
...
Dheevatsa Mudigere
Maxim Naumov
Martin D. Schatz
M. Smelyanskiy
Xiaodong Wang
49
216
0
30 Dec 2019