arXiv: 1704.04760
In-Datacenter Performance Analysis of a Tensor Processing Unit


16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,165 papers shown
Swift for TensorFlow: A portable, flexible platform for deep learning
Brennan Saeta
Denys Shabalin
M. Rasi
Brad Larson
Xihui Wu
...
Saleem Abdulrasool
A. Efremov
Dave Abrahams
Chris Lattner
Richard Wei
HAI
32
11
0
26 Feb 2021
LogME: Practical Assessment of Pre-trained Models for Transfer Learning
Kaichao You
Yong Liu
Jianmin Wang
Mingsheng Long
35
178
0
22 Feb 2021
An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks
K. Seshadri
Berkin Akin
James Laudon
Ravi Narayanaswami
Amir Yazdanbakhsh
35
118
0
20 Feb 2021
Control Variate Approximation for DNN Accelerators
Georgios Zervakis
Ourania Spantidi
Iraklis Anagnostopoulos
H. Amrouch
J. Henkel
BDL
34
22
0
18 Feb 2021
Combinatorial optimization and reasoning with graph neural networks
Quentin Cappart
Didier Chételat
Elias Boutros Khalil
Andrea Lodi
Christopher Morris
Petar Velickovic
AI4CE
37
352
0
18 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
281
179
0
17 Feb 2021
A Survey of Machine Learning for Computer Architecture and Systems
Nan Wu
Yuan Xie
AI4TS
AI4CE
25
145
0
16 Feb 2021
GradPIM: A Practical Processing-in-DRAM Architecture for Gradient Descent
Heesu Kim
Hanmin Park
Taehyun Kim
Kwanheum Cho
Eojin Lee
Soojung Ryu
Hyuk-Jae Lee
Kiyoung Choi
Jinho Lee
24
36
0
15 Feb 2021
CrossLight: A Cross-Layer Optimized Silicon Photonic Neural Network Accelerator
Febin P. Sunny
Asif Mirza
Mahdi Nikdast
S. Pasricha
11
70
0
13 Feb 2021
Discovery of Options via Meta-Learned Subgoals
Vivek Veeriah
Tom Zahavy
Matteo Hessel
Zhongwen Xu
Junhyuk Oh
Iurii Kemaev
H. V. Hasselt
David Silver
Satinder Singh
29
33
0
12 Feb 2021
A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes
Zachary Nado
Justin M. Gilmer
Christopher J. Shallue
Rohan Anil
George E. Dahl
ODL
30
27
0
12 Feb 2021
Temporal Parallelization of Inference in Hidden Markov Models
S. S. Hassan
Simo Särkkä
Á. F. García-Fernández
TPM
11
12
0
10 Feb 2021
Searching for Fast Model Families on Datacenter Accelerators
Sheng Li
Mingxing Tan
Ruoming Pang
Andrew Li
Liqun Cheng
Quoc V. Le
N. Jouppi
44
34
0
10 Feb 2021
Colorization Transformer
Manoj Kumar
Dirk Weissenborn
Nal Kalchbrenner
ViT
232
158
0
08 Feb 2021
Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
Shang Wang
Peiming Yang
Yuxuan Zheng
Xuelong Li
Gennady Pekhimenko
16
22
0
03 Feb 2021
Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines
Francisco Romero
Mark Zhao
N. Yadwadkar
Christos Kozyrakis
35
101
0
03 Feb 2021
Truly Sparse Neural Networks at Scale
Selima Curci
Decebal Constantin Mocanu
Mykola Pechenizkiy
45
19
0
02 Feb 2021
A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Geoffrey X. Yu
Yubo Gao
P. Golikov
Gennady Pekhimenko
3DH
36
67
0
31 Jan 2021
Parallel Iterated Extended and Sigma-point Kalman Smoothers
F. Yaghoobi
Adrien Corenflos
Sakira Hassan
Simo Särkkä
17
13
0
31 Jan 2021
A Competitive Edge: Can FPGAs Beat GPUs at DCNN Inference Acceleration in Resource-Limited Edge Computing Applications?
Ian Colbert
Jake Daly
Ken Kreutz-Delgado
Srinjoy Das
20
13
0
30 Jan 2021
Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators
Hamzah Abdel-Aziz
Ali Shafiee
J. Shin
A. Pedram
Joseph Hassoun
MQ
42
10
0
27 Jan 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
50
103
0
25 Jan 2021
AdderNet and its Minimalist Hardware Design for Energy-Efficient Artificial Intelligence
Yunhe Wang
Mingqiang Huang
Kai Han
Hanting Chen
Wei Zhang
Chunjing Xu
Dacheng Tao
53
34
0
25 Jan 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
150
678
0
24 Jan 2021
MinConvNets: A new class of multiplication-less Neural Networks
Xuecan Yang
S. Chaudhuri
Laurence Likforman
L. Naviner
19
0
0
23 Jan 2021
Direct Spatial Implementation of Sparse Matrix Multipliers for Reservoir Computing
Matthew Denton
H. Schmit
13
2
0
21 Jan 2021
Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden
Roman Böhringer
Tal Ben-Nun
Torsten Hoefler
36
57
0
21 Jan 2021
Accelerating Deep Learning Inference via Learned Caches
Arjun Balasubramanian
Adarsh Kumar
Yuhan Liu
Han Cao
Shivaram Venkataraman
Aditya Akella
28
18
0
18 Jan 2021
NNStreamer: Efficient and Agile Development of On-Device AI Systems
MyungJoo Ham
Jijoong Moon
Geunsik Lim
Jaeyun Jung
Hyoungjoo Ahn
...
Parichay Kapoor
Dongju Chae
Gichan Jang
Y. Ahn
Jihoon Lee
28
6
0
16 Jan 2021
STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations
Suryanarayana Maddu
D. Sturm
B. Cheeseman
Christian L. Müller
I. Sbalzarini
32
8
0
15 Jan 2021
Self-Adaptive Reconfigurable Arrays (SARA): Using ML to Assist Scaling GEMM Acceleration
A. Samajdar
Michael Pellauer
T. Krishna
22
4
0
12 Jan 2021
TensorX: Extensible API for Neural Network Model Design and Deployment
Davide Nunes
Luis M. Antunes
17
0
0
29 Dec 2020
SimBricks: End-to-End Network System Evaluation with Modular Simulation
Hejing Li
Jialin Li
Antoine Kaufmann
9
20
0
28 Dec 2020
Assured RL: Reinforcement Learning with Almost Sure Constraints
Agustin Castellano
J. Bazerque
Enrique Mallada
16
1
0
24 Dec 2020
AutonoML: Towards an Integrated Framework for Autonomous Machine Learning
D. Kedziora
Katarzyna Musial
Bogdan Gabrys
30
16
0
23 Dec 2020
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Maurizio Capra
Beatrice Bussolino
Alberto Marchisio
Guido Masera
Maurizio Martina
Mohamed Bennai
BDL
64
140
0
21 Dec 2020
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Hanrui Wang
Zhekai Zhang
Song Han
48
380
0
17 Dec 2020
Real-time Multi-Task Diffractive Deep Neural Networks via Hardware-Software Co-design
Yingjie Li
Ruiyang Chen
B. S. Rodriguez
Weilu Gao
Cunxi Yu
35
4
0
16 Dec 2020
A hybrid quantum-classical neural network with deep residual learning
Yanying Liang
Wei Peng
Zhu-Jun Zheng
Olli Silvén
Guoying Zhao
28
45
0
14 Dec 2020
Neighbors From Hell: Voltage Attacks Against Deep Learning Accelerators on Multi-Tenant FPGAs
Andrew Boutros
Mathew Hall
Nicolas Papernot
Vaughn Betz
19
38
0
14 Dec 2020
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
52
35
0
12 Dec 2020
Hardware Beyond Backpropagation: a Photonic Co-Processor for Direct Feedback Alignment
Julien Launay
Iacopo Poli
Kilian Muller
Gustave Pariente
I. Carron
L. Daudet
Florent Krzakala
S. Gigan
MoE
20
18
0
11 Dec 2020
Imitating Interactive Intelligence
Josh Abramson
Arun Ahuja
Iain Barr
Arthur Brussee
Federico Carnevale
...
Greg Wayne
Duncan Williams
Nathaniel Wong
Chen Yan
Rui Zhu
LM&Ro
24
71
0
10 Dec 2020
The Why, What and How of Artificial General Intelligence Chip Development
Alex P. James
27
20
0
08 Dec 2020
Real-Time Formal Verification of Autonomous Systems With An FPGA
Minh Bui
Michael Lu
Reza Hojabr
Mo Chen
Arrvindh Shriraman
9
4
0
07 Dec 2020
Monadic Pavlovian associative learning in a backpropagation-free photonic network
James Y. S. Tan
Zengguang Cheng
J. Feldmann
Xuan Li
Nathan Youngblood
U. E. Ali
David Wright
W. Pernice
H. Bhaskaran
21
13
0
30 Nov 2020
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference
Thierry Tambe
Coleman Hooper
Lillian Pentecost
Tianyu Jia
En-Yu Yang
...
Victor Sanh
P. Whatmough
Alexander M. Rush
David Brooks
Gu-Yeon Wei
20
117
0
28 Nov 2020
Ax-BxP: Approximate Blocked Computation for Precision-Reconfigurable Deep Neural Network Acceleration
Reena Elangovan
Shubham Jain
A. Raghunathan
17
7
0
25 Nov 2020
Bringing AI To Edge: From Deep Learning's Perspective
Di Liu
Hao Kong
Xiangzhong Luo
Weichen Liu
Ravi Subramaniam
54
116
0
25 Nov 2020
End-to-End Framework for Efficient Deep Learning Using Metasurfaces Optics
Carlos Mauricio Villegas Burgos
Tianqi Yang
Nick Vamivakas
Yuhao Zhu
22
0
0
23 Nov 2020