1704.04760
Cited By
In-Datacenter Performance Analysis of a Tensor Processing Unit
16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
Papers citing
"In-Datacenter Performance Analysis of a Tensor Processing Unit"
50 / 1,165 papers shown
Title
A Full-Stack Search Technique for Domain Optimized Deep Learning Accelerators
Dan Zhang
Safeen Huda
Ebrahim M. Songhori
Kartik Prabhu
Quoc V. Le
Anna Goldie
Azalia Mirhoseini
34
51
0
26 May 2021
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale
Zhaoxia Deng
Jongsoo Park
P. T. P. Tang
Haixin Liu
...
S. Nadathur
Changkyu Kim
Maxim Naumov
S. Naghshineh
M. Smelyanskiy
29
11
0
26 May 2021
FENXI: Deep-learning Traffic Analytics at the Edge
Massimo Gallo
A. Finamore
G. Simon
Dario Rossi
14
7
0
25 May 2021
GNNIE: GNN Inference Engine with Load-balancing and Graph-Specific Caching
Sudipta Mondal
Susmita Dey Manasi
K. Kunal
S. Ramprasath
S. Sapatnekar
GNN
6
15
0
21 May 2021
RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance
Udit Gupta
Samuel Hsia
J. Zhang
Mark Wilkening
Javin Pombra
Hsien-Hsin S. Lee
Gu-Yeon Wei
Carole-Jean Wu
David Brooks
41
32
0
18 May 2021
SimNet: Accurate and High-Performance Computer Architecture Simulation using Deep Learning
Lingda Li
Santosh Pandey
T. Flynn
Hang Liu
Noel Wheeler
A. Hoisie
32
7
0
12 May 2021
PIM-DRAM: Accelerating Machine Learning Workloads using Processing in Commodity DRAM
Sourjya Roy
M. Ali
A. Raghunathan
19
19
0
08 May 2021
Neural network architectures using min-plus algebra for solving certain high dimensional optimal control problems and Hamilton-Jacobi PDEs
Jérome Darbon
P. Dower
Tingwei Meng
19
22
0
07 May 2021
CoSA: Scheduling by Constrained Optimization for Spatial Accelerators
Qijing Huang
Minwoo Kang
Grace Dinh
Thomas Norell
Aravind Kalaiah
J. Demmel
J. Wawrzynek
Y. Shao
23
108
0
05 May 2021
Modulating Regularization Frequency for Efficient Compression-Aware Model Training
Dongsoo Lee
S. Kwon
Byeongwook Kim
Jeongin Yun
Baeseong Park
Yongkweon Jeon
27
0
0
05 May 2021
HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation
Qingcheng Xiao
Wenlei Bao
Bingzhe Wu
Pengcheng Xu
Xuehai Qian
Yun Liang
45
67
0
04 May 2021
Connecting AI Learning and Blockchain Mining in 6G Systems
Yunkai Wei
Zixian An
S. Leng
Kun Yang
15
1
0
29 Apr 2021
MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores
Sheng-Chun Kao
T. Krishna
24
51
0
28 Apr 2021
An optical neural network using less than 1 photon per multiplication
Tianyu Wang
Shifan Ma
Logan G. Wright
Tatsuhiro Onodera
Brian C. Richard
Peter L. McMahon
55
177
0
27 Apr 2021
Efficient training of physics-informed neural networks via importance sampling
M. A. Nabian
R. J. Gladstone
Hadi Meidani
DiffM
PINN
75
225
0
26 Apr 2021
Measuring what Really Matters: Optimizing Neural Networks for TinyML
Lennart Heim
Andreas Biri
Zhongnan Qu
Lothar Thiele
56
30
0
21 Apr 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
245
487
0
20 Apr 2021
DynO: Dynamic Onloading of Deep Neural Networks from Cloud to Device
Mario Almeida
Stefanos Laskaridis
Stylianos I. Venieris
Ilias Leontiadis
Nicholas D. Lane
24
36
0
20 Apr 2021
CoDR: Computation and Data Reuse Aware CNN Accelerator
Alireza Khadem
Haojie Ye
T. Mudge
6
0
0
20 Apr 2021
End-to-End Jet Classification of Boosted Top Quarks with the CMS Open Data
Michael Andrews
Bjorn Burkle
Yi-fan Chen
Davide DiCroce
S. Gleyzer
...
N. Pervan
Yusef Shafi
Wei-Ju Sun
Emanuele Usai
Kun Yang
23
10
0
19 Apr 2021
Arithmetic-Intensity-Guided Fault Tolerance for Neural Network Inference on GPUs
J. Kosaian
K. V. Rashmi
38
33
0
19 Apr 2021
Learning on Hardware: A Tutorial on Neural Network Accelerators and Co-Processors
Lukas Baischer
M. Wess
N. Taherinejad
28
12
0
19 Apr 2021
RingCNN: Exploiting Algebraically-Sparse Ring Tensors for Energy-Efficient CNN-Based Computational Imaging
Chao-Tsung Huang
45
10
0
19 Apr 2021
Demystifying BERT: Implications for Accelerator Design
Suchita Pati
Shaizeen Aga
Nuwan Jayasena
Matthew D. Sinclair
LLMAG
40
17
0
14 Apr 2021
Mitigating Adversarial Attack for Compute-in-Memory Accelerator Utilizing On-chip Finetune
Shanshi Huang
Hongwu Jiang
Shimeng Yu
AAML
26
3
0
13 Apr 2021
Podracer architectures for scalable Reinforcement Learning
Matteo Hessel
M. Kroiss
Aidan Clark
Iurii Kemaev
John Quan
Thomas Keck
Fabio Viola
H. V. Hasselt
24
38
0
13 Apr 2021
Optimizing the Whole-life Cost in End-to-end CNN Acceleration
Jiaqi Zhang
Xiangru Chen
S. Ray
8
8
0
12 Apr 2021
Deep Learning and Traffic Classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications
Lixuan Yang
A. Finamore
Feng Jun
Dario Rossi
22
46
0
07 Apr 2021
A matrix math facility for Power ISA(TM) processors
José Moreira
Kit Barton
Steven J. Battle
Peter Bergner
Ramon Bertran Monfort
...
Rajalakshmi Srinivasaraghavan
Shricharan Srivatsan
Brian W. Thompto
Andreas Wagner
Nelson Wu
6
14
0
07 Apr 2021
GPU Domain Specialization via Composable On-Package Architecture
Yaosheng Fu
Evgeny Bolotin
Niladrish Chatterjee
D. Nellans
S. Keckler
22
12
0
05 Apr 2021
Tight Compression: Compressing CNN Through Fine-Grained Pruning and Weight Permutation for Efficient Implementation
Xizi Chen
Jingyang Zhu
Jingbo Jiang
Chi-Ying Tsui
21
12
0
03 Apr 2021
Exploring Edge TPU for Network Intrusion Detection in IoT
Seyedehfaezeh Hosseininoorbin
S. Layeghy
Mohanad Sarhan
Raja Jurdak
Marius Portmann
14
22
0
30 Mar 2021
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
Jonathan T. Barron
B. Mildenhall
Matthew Tancik
Peter Hedman
Ricardo Martín Brualla
Pratul P. Srinivasan
16
1,917
0
24 Mar 2021
FastMoE: A Fast Mixture-of-Expert Training System
Jiaao He
J. Qiu
Aohan Zeng
Zhilin Yang
Jidong Zhai
Jie Tang
ALM
MoE
45
94
0
24 Mar 2021
Hardware Acceleration of Explainable Machine Learning using Tensor Processing Units
Zhixin Pan
Prabhat Mishra
35
18
0
22 Mar 2021
Extending Sparse Tensor Accelerators to Support Multiple Compression Formats
Eric Qin
Geonhwa Jeong
William Won
Sheng-Chun Kao
Hyoukjun Kwon
Sudarshan Srinivasan
Dipankar Das
G. Moon
S. Rajamanickam
T. Krishna
35
18
0
18 Mar 2021
Understanding the Design-Space of Sparse/Dense Multiphase GNN dataflows on Spatial Accelerators
Raveesh Garg
Eric Qin
Francisco Munoz-Martínez
Robert Guirado
Akshay Jain
...
José L. Abellán
M. Acacio
Eduard Alarcón
S. Rajamanickam
T. Krishna
GNN
14
18
0
14 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Nayeon Lee
Jonathon Shlens
Barret Zoph
36
298
0
13 Mar 2021
The Old and the New: Can Physics-Informed Deep-Learning Replace Traditional Linear Solvers?
Stefano Markidis
PINN
48
184
0
12 Mar 2021
Performance of a Geometric Deep Learning Pipeline for HL-LHC Particle Tracking
X. Ju
D. Murnane
P. Calafiura
Nicholas Choma
S. Conlon
...
Aditi Chauhan
A. Schuy
Shih-Chieh Hsu
A. Ballow
A. Lazar
29
62
0
11 Mar 2021
Proof-of-Learning: Definitions and Practice
Hengrui Jia
Mohammad Yaghini
Christopher A. Choquette-Choo
Natalie Dullerud
Anvith Thudi
Varun Chandrasekaran
Nicolas Papernot
AAML
25
99
0
09 Mar 2021
F-CAD: A Framework to Explore Hardware Accelerators for Codec Avatar Decoding
Xiaofan Zhang
Dawei Wang
P. Chuang
Shugao Ma
Deming Chen
Yuecheng Li
VGen
14
10
0
08 Mar 2021
Reliability-Aware Quantization for Anti-Aging NPUs
Sami Salamin
Georgios Zervakis
Ourania Spantidi
Iraklis Anagnostopoulos
J. Henkel
H. Amrouch
19
13
0
08 Mar 2021
ShEF: Shielded Enclaves for Cloud FPGAs
Mark Zhao
Mingyu Gao
Christos Kozyrakis
41
55
0
05 Mar 2021
BM3D vs 2-Layer ONN
Junaid Malik
S. Kiranyaz
Mehmet Yamaç
Moncef Gabbouj
31
11
0
04 Mar 2021
Sparse Training Theory for Scalable and Efficient Agents
Decebal Constantin Mocanu
Elena Mocanu
T. Pinto
Selima Curci
Phuong H. Nguyen
M. Gibescu
D. Ernst
Z. Vale
47
17
0
02 Mar 2021
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search
Kartik Hegde
Po-An Tsai
Sitao Huang
Vikas Chandra
A. Parashar
Christopher W. Fletcher
26
93
0
02 Mar 2021
Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
27
29
0
01 Mar 2021
Accelerating Recommendation System Training by Leveraging Popular Choices
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Prashant J. Nair
30
56
0
01 Mar 2021
On the Utility of Gradient Compression in Distributed Training Systems
Saurabh Agarwal
Hongyi Wang
Shivaram Venkataraman
Dimitris Papailiopoulos
41
46
0
28 Feb 2021