ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

In-Datacenter Performance Analysis of a Tensor Processing Unit
arXiv:1704.04760 · 16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,167 papers shown
Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation
Jiaqi Gu
Hanqing Zhu
Chenghao Feng
Mingjie Liu
Zixuan Jiang
Ray T. Chen
David Z. Pan
44
4
0
25 Aug 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM MLLM
161
799
0
24 Aug 2021
DeepEdgeBench: Benchmarking Deep Neural Networks on Edge Devices
Stephan Patrick Baller
Anshul Jindal
Mohak Chadha
Michael Gerndt
41
73
0
21 Aug 2021
Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training
Mark Zhao
Niket Agarwal
Aarti Basant
B. Gedik
Satadru Pan
...
Kevin Wilfong
Harsha Rastogi
Carole-Jean Wu
Christos Kozyrakis
Parikshit Pol
GNN
84
76
0
20 Aug 2021
Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory
W. Wan
Rajkumar Kubendran
Clemens J. S. Schaefer
S. Eryilmaz
Wenqiang Zhang
...
B. Gao
Siddharth Joshi
Huaqiang Wu
H. P. Wong
Gert Cauwenberghs
43
12
0
17 Aug 2021
AIRCHITECT: Learning Custom Architecture Design and Mapping Space
A. Samajdar
J. Joseph
Matthew Denton
T. Krishna
125
7
0
16 Aug 2021
LayerPipe: Accelerating Deep Neural Network Training by Intra-Layer and Inter-Layer Gradient Pipelining and Multiprocessor Scheduling
Nanda K. Unnikrishnan
Keshab K. Parhi
AI4CE
37
6
0
14 Aug 2021
A Distributed SGD Algorithm with Global Sketching for Deep Learning Training Acceleration
Lingfei Dai
Boyu Diao
Chao Li
Yongjun Xu
68
5
0
13 Aug 2021
From Domain-Specific Languages to Memory-Optimized Accelerators for Fluid Dynamics
Karl F. A. Friebel
Stephanie Soldavini
G. Hempel
C. Pilato
J. Castrillón
74
9
0
06 Aug 2021
BEANNA: A Binary-Enabled Architecture for Neural Network Acceleration
C. Terrill
Fred Chu
MQ
39
0
0
04 Aug 2021
Large-Scale Differentially Private BERT
Rohan Anil
Badih Ghazi
Vineet Gupta
Ravi Kumar
Pasin Manurangsi
93
139
0
03 Aug 2021
Data Streaming and Traffic Gathering in Mesh-based NoC for Deep Neural Network Acceleration
Binayak Tiwari
Mei Yang
Xiaohang Wang
Yingtao Jiang
GNN
14
3
0
01 Aug 2021
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle
Sebastian Borgeaud
Jean-Baptiste Alayrac
Carl Doersch
Catalin Ionescu
...
Olivier J. Hénaff
M. Botvinick
Andrew Zisserman
Oriol Vinyals
João Carreira
MLLM VLM GNN
151
585
0
30 Jul 2021
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework
Miao Yin
Yang Sui
Siyu Liao
Bo Yuan
60
81
0
26 Jul 2021
CREW: Computation Reuse and Efficient Weight Storage for Hardware-accelerated MLPs and RNNs
Marc Riera
J. Arnau
Antonio González
31
5
0
20 Jul 2021
Positive/Negative Approximate Multipliers for DNN Accelerators
Ourania Spantidi
Georgios Zervakis
Iraklis Anagnostopoulos
H. Amrouch
J. Henkel
45
19
0
20 Jul 2021
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning
Young Geun Kim
Carole-Jean Wu
102
87
0
16 Jul 2021
S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration
Zhi-Gang Liu
P. Whatmough
Yuhao Zhu
Matthew Mattina
MQ
90
81
0
16 Jul 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao
Suvinay Subramanian
Gaurav Agrawal
Amir Yazdanbakhsh
T. Krishna
130
64
0
13 Jul 2021
ROBIN: A Robust Optical Binary Neural Network Accelerator
Febin P. Sunny
Asif Mirza
Mahdi Nikdast
S. Pasricha
MQ
69
36
0
12 Jul 2021
Structured Model Pruning of Convolutional Networks on Tensor Processing Units
Kongtao Chen
Ken Franko
Ruoxin Sang
CVBM
38
59
0
09 Jul 2021
Tensor Methods in Computer Vision and Deep Learning
Yannis Panagakis
Jean Kossaifi
Grigorios G. Chrysos
James Oldfield
M. Nicolaou
Anima Anandkumar
Stefanos Zafeiriou
62
126
0
07 Jul 2021
Simple Training Strategies and Model Scaling for Object Detection
Xianzhi Du
Barret Zoph
Wei-Chih Hung
Nayeon Lee
ObjD
111
41
0
30 Jun 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
242
793
0
25 Jun 2021
A Construction Kit for Efficient Low Power Neural Network Accelerator Designs
Petar Jokic
E. Azarkhish
Andrea Bonetti
M. Pons
S. Emery
Luca Benini
66
4
0
24 Jun 2021
APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores
Boyuan Feng
Yuke Wang
Tong Geng
Ang Li
Yufei Ding
MQ
74
37
0
23 Jun 2021
NAX: Co-Designing Neural Network and Hardware Architecture for Memristive Xbar based Computing Systems
Shubham Negi
I. Chakraborty
Aayush Ankit
Kaushik Roy
34
6
0
23 Jun 2021
Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang
Xingyao Zhang
Shuaiwen Leon Song
Sara Hooker
87
78
0
22 Jun 2021
Boggart: Towards General-Purpose Acceleration of Retrospective Video Analytics
Neil Agarwal
Ravi Netravali
95
15
0
21 Jun 2021
Dive into Deep Learning
Aston Zhang
Zachary Chase Lipton
Mu Li
Alexander J. Smola
VLM
104
572
0
21 Jun 2021
How to Reach Real-Time AI on Consumer Devices? Solutions for Programmable and Custom Architectures
Stylianos I. Venieris
Ioannis Panopoulos
Ilias Leontiadis
I. Venieris
84
6
0
21 Jun 2021
Multiplying Matrices Without Multiplying
Davis W. Blalock
John Guttag
120
52
0
21 Jun 2021
A Survey on Serverless Computing
J. John
Shashank Gupta
50
64
0
20 Jun 2021
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
G. Moon
Hyoukjun Kwon
Geonhwa Jeong
Prasanth Chatarasi
S. Rajamanickam
T. Krishna
37
28
0
19 Jun 2021
FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator
Geng Yuan
Payman Behnam
Zhengang Li
Ali Shafiee
Sheng Lin
...
Hang Liu
Xuehai Qian
M. N. Bojnordi
Yanzhi Wang
Caiwen Ding
56
68
0
16 Jun 2021
Automatic Construction of Evaluation Suites for Natural Language Generation Datasets
Simon Mille
Kaustubh D. Dhole
Saad Mahamood
Laura Perez-Beltrachini
Varun Gangal
Mihir Kale
Emiel van Miltenburg
Sebastian Gehrmann
ELM
87
23
0
16 Jun 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM MedIm
110
391
0
16 Jun 2021
Scene Transformer: A unified architecture for predicting multiple agent trajectories
Jiquan Ngiam
Benjamin Caine
Vijay Vasudevan
Zhengdong Zhang
H. Chiang
...
Ashish Venugopal
David J. Weiss
Benjamin Sapp
Zhifeng Chen
Jonathon Shlens
129
168
0
15 Jun 2021
ShortcutFusion: From Tensorflow to FPGA-based accelerator with reuse-aware memory allocation for shortcut data
Duy-Thanh Nguyen
Hyeonseung Je
Tuan Nghia Nguyen
Soojung Ryu
Kyujoong Lee
Hyuk-Jae Lee
53
30
0
15 Jun 2021
S2Engine: A Novel Systolic Architecture for Sparse Convolutional Neural Networks
Jianlei Yang
Wenzhi Fu
Xingzhou Cheng
Xucheng Ye
Pengcheng Dai
Weisheng Zhao
41
8
0
15 Jun 2021
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Yonggan Fu
Yongan Zhang
Yang Zhang
David D. Cox
Yingyan Lin
MQ
125
18
0
11 Jun 2021
Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer
Xiaohua Zhai
Amelie Royer
L. Markeeva
Rohan Anil
Alexander Kolesnikov
VLM
109
302
0
09 Jun 2021
HASI: Hardware-Accelerated Stochastic Inference, A Defense Against Adversarial Machine Learning Attacks
Mohammad Hossein Samavatian
Saikat Majumdar
Kristin Barber
R. Teodorescu
AAML
121
4
0
09 Jun 2021
Dynamic Sparse Training for Deep Reinforcement Learning
Ghada Sokar
Elena Mocanu
Decebal Constantin Mocanu
Mykola Pechenizkiy
Peter Stone
108
60
0
08 Jun 2021
Top-KAST: Top-K Always Sparse Training
Siddhant M. Jayakumar
Razvan Pascanu
Jack W. Rae
Simon Osindero
Erich Elsen
184
100
0
07 Jun 2021
JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
Hao Liu
Qian Gao
Jiang Li
X. Liao
Hao Xiong
...
Guobao Yang
Zhiwei Zha
Daxiang Dong
Dejing Dou
Haoyi Xiong
VLM
85
22
0
03 Jun 2021
Machine Learning for Security in Vehicular Networks: A Comprehensive Survey
Anum Talpur
M. Gurusamy
53
63
0
31 May 2021
Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering
Liang Luo
Jacob Nelson
Arvind Krishnamurthy
Luis Ceze
235
1
0
28 May 2021
Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval
Zhengchun Liu
Ahsan Ali
Peter Kenesei
Antonino Miceli
Hemant Sharma
...
Naoufal Layad
Jana Thayer
R. Herbst
Chun Hong Yoon
Ian Foster
65
23
0
28 May 2021
FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays
Surya Selvam
Vinod Ganesan
Pratyush Kumar
58
10
0
27 May 2021