1704.04760
In-Datacenter Performance Analysis of a Tensor Processing Unit
16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
Papers citing
"In-Datacenter Performance Analysis of a Tensor Processing Unit"
50 / 1,165 papers shown
Title
Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training
Mark Zhao
Niket Agarwal
Aarti Basant
B. Gedik
Satadru Pan
...
Kevin Wilfong
Harsha Rastogi
Carole-Jean Wu
Christos Kozyrakis
Parikshit Pol
GNN
39
70
0
20 Aug 2021
Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory
W. Wan
Rajkumar Kubendran
Clemens J. S. Schaefer
S. Eryilmaz
Wenqiang Zhang
...
B. Gao
Siddharth Joshi
Huaqiang Wu
H. P. Wong
Gert Cauwenberghs
21
12
0
17 Aug 2021
AIRCHITECT: Learning Custom Architecture Design and Mapping Space
A. Samajdar
J. Joseph
Matthew Denton
T. Krishna
33
7
0
16 Aug 2021
LayerPipe: Accelerating Deep Neural Network Training by Intra-Layer and Inter-Layer Gradient Pipelining and Multiprocessor Scheduling
Nanda K. Unnikrishnan
Keshab K. Parhi
AI4CE
22
5
0
14 Aug 2021
A Distributed SGD Algorithm with Global Sketching for Deep Learning Training Acceleration
Lingfei Dai
Boyu Diao
Chao Li
Yongjun Xu
41
5
0
13 Aug 2021
From Domain-Specific Languages to Memory-Optimized Accelerators for Fluid Dynamics
Karl F. A. Friebel
Stephanie Soldavini
G. Hempel
C. Pilato
J. Castrillón
40
9
0
06 Aug 2021
BEANNA: A Binary-Enabled Architecture for Neural Network Acceleration
C. Terrill
Fred Chu
MQ
19
0
0
04 Aug 2021
Large-Scale Differentially Private BERT
Rohan Anil
Badih Ghazi
Vineet Gupta
Ravi Kumar
Pasin Manurangsi
41
132
0
03 Aug 2021
Data Streaming and Traffic Gathering in Mesh-based NoC for Deep Neural Network Acceleration
Binayak Tiwari
Mei Yang
Xiaohang Wang
Yingtao Jiang
GNN
4
2
0
01 Aug 2021
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle
Sebastian Borgeaud
Jean-Baptiste Alayrac
Carl Doersch
Catalin Ionescu
...
Olivier J. Hénaff
M. Botvinick
Andrew Zisserman
Oriol Vinyals
João Carreira
MLLM
VLM
GNN
25
567
0
30 Jul 2021
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework
Miao Yin
Yang Sui
Siyu Liao
Bo Yuan
28
79
0
26 Jul 2021
CREW: Computation Reuse and Efficient Weight Storage for Hardware-accelerated MLPs and RNNs
Marc Riera
J. Arnau
Antonio González
29
5
0
20 Jul 2021
Positive/Negative Approximate Multipliers for DNN Accelerators
Ourania Spantidi
Georgios Zervakis
Iraklis Anagnostopoulos
H. Amrouch
J. Henkel
26
18
0
20 Jul 2021
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning
Young Geun Kim
Carole-Jean Wu
26
85
0
16 Jul 2021
S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration
Zhi-Gang Liu
P. Whatmough
Yuhao Zhu
Matthew Mattina
MQ
35
75
0
16 Jul 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao
Suvinay Subramanian
Gaurav Agrawal
Amir Yazdanbakhsh
T. Krishna
51
58
0
13 Jul 2021
ROBIN: A Robust Optical Binary Neural Network Accelerator
Febin P. Sunny
Asif Mirza
Mahdi Nikdast
S. Pasricha
MQ
33
35
0
12 Jul 2021
Structured Model Pruning of Convolutional Networks on Tensor Processing Units
Kongtao Chen
Ken Franko
Ruoxin Sang
CVBM
11
59
0
09 Jul 2021
Tensor Methods in Computer Vision and Deep Learning
Yannis Panagakis
Jean Kossaifi
Grigorios G. Chrysos
James Oldfield
M. Nicolaou
Anima Anandkumar
Stefanos Zafeiriou
38
120
0
07 Jul 2021
Simple Training Strategies and Model Scaling for Object Detection
Xianzhi Du
Barret Zoph
Wei-Chih Hung
Nayeon Lee
ObjD
33
40
0
30 Jun 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
96
755
0
25 Jun 2021
A Construction Kit for Efficient Low Power Neural Network Accelerator Designs
Petar Jokic
E. Azarkhish
Andrea Bonetti
M. Pons
S. Emery
Luca Benini
26
3
0
24 Jun 2021
APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores
Boyuan Feng
Yuke Wang
Tong Geng
Ang Li
Yufei Ding
MQ
23
37
0
23 Jun 2021
NAX: Co-Designing Neural Network and Hardware Architecture for Memristive Xbar based Computing Systems
Shubham Negi
I. Chakraborty
Aayush Ankit
Kaushik Roy
25
6
0
23 Jun 2021
Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang
Xingyao Zhang
Shuaiwen Leon Song
Sara Hooker
27
75
0
22 Jun 2021
Boggart: Towards General-Purpose Acceleration of Retrospective Video Analytics
Neil Agarwal
Ravi Netravali
32
14
0
21 Jun 2021
Dive into Deep Learning
Aston Zhang
Zachary Chase Lipton
Mu Li
Alexander J. Smola
VLM
30
561
0
21 Jun 2021
How to Reach Real-Time AI on Consumer Devices? Solutions for Programmable and Custom Architectures
Stylianos I. Venieris
Ioannis Panopoulos
Ilias Leontiadis
I. Venieris
38
6
0
21 Jun 2021
Multiplying Matrices Without Multiplying
Davis W. Blalock
John Guttag
27
51
0
21 Jun 2021
A Survey on Serverless Computing
J. John
Shashank Gupta
25
59
0
20 Jun 2021
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
G. Moon
Hyoukjun Kwon
Geonhwa Jeong
Prasanth Chatarasi
S. Rajamanickam
T. Krishna
6
28
0
19 Jun 2021
FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator
Geng Yuan
Payman Behnam
Zhengang Li
Ali Shafiee
Sheng Lin
...
Hang Liu
Xuehai Qian
M. N. Bojnordi
Yanzhi Wang
Caiwen Ding
24
68
0
16 Jun 2021
Automatic Construction of Evaluation Suites for Natural Language Generation Datasets
Simon Mille
Kaustubh D. Dhole
Saad Mahamood
Laura Perez-Beltrachini
Varun Gangal
Mihir Kale
Emiel van Miltenburg
Sebastian Gehrmann
ELM
52
22
0
16 Jun 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM
MedIm
23
367
0
16 Jun 2021
Scene Transformer: A unified architecture for predicting multiple agent trajectories
Jiquan Ngiam
Benjamin Caine
Vijay Vasudevan
Zhengdong Zhang
H. Chiang
...
Ashish Venugopal
David J. Weiss
Benjamin Sapp
Zhifeng Chen
Jonathon Shlens
22
157
0
15 Jun 2021
ShortcutFusion: From Tensorflow to FPGA-based accelerator with reuse-aware memory allocation for shortcut data
Duy-Thanh Nguyen
Hyeonseung Je
Tuan Nghia Nguyen
Soojung Ryu
Kyujoong Lee
Hyuk-Jae Lee
24
26
0
15 Jun 2021
S2Engine: A Novel Systolic Architecture for Sparse Convolutional Neural Networks
Jianlei Yang
Wenzhi Fu
Xingzhou Cheng
Xucheng Ye
Pengcheng Dai
Weisheng Zhao
16
8
0
15 Jun 2021
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Yonggan Fu
Yongan Zhang
Yang Zhang
David D. Cox
Yingyan Lin
MQ
64
18
0
11 Jun 2021
Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer
Xiaohua Zhai
Amelie Royer
L. Markeeva
Rohan Anil
Alexander Kolesnikov
VLM
52
287
0
09 Jun 2021
HASI: Hardware-Accelerated Stochastic Inference, A Defense Against Adversarial Machine Learning Attacks
Mohammad Hossein Samavatian
Saikat Majumdar
Kristin Barber
R. Teodorescu
AAML
31
4
0
09 Jun 2021
Dynamic Sparse Training for Deep Reinforcement Learning
Ghada Sokar
Elena Mocanu
Decebal Constantin Mocanu
Mykola Pechenizkiy
Peter Stone
23
52
0
08 Jun 2021
Top-KAST: Top-K Always Sparse Training
Siddhant M. Jayakumar
Razvan Pascanu
Jack W. Rae
Simon Osindero
Erich Elsen
14
97
0
07 Jun 2021
JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
Hao Liu
Qian Gao
Jiang Li
X. Liao
Hao Xiong
...
Guobao Yang
Zhiwei Zha
Daxiang Dong
Dejing Dou
Haoyi Xiong
VLM
30
22
0
03 Jun 2021
Machine Learning for Security in Vehicular Networks: A Comprehensive Survey
Anum Talpur
M. Gurusamy
17
61
0
31 May 2021
Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering
Liang Luo
Jacob Nelson
Arvind Krishnamurthy
Luis Ceze
127
1
0
28 May 2021
Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval
Zhengchun Liu
Ahsan Ali
Peter Kenesei
Antonino Miceli
Hemant Sharma
...
Naoufal Layad
Jana Thayer
R. Herbst
Chun Hong Yoon
Ian Foster
27
22
0
28 May 2021
FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays
Surya Selvam
Vinod Ganesan
Pratyush Kumar
38
10
0
27 May 2021
Towards Efficient Full 8-bit Integer DNN Online Training on Resource-limited Devices without Batch Normalization
Yukuan Yang
Xiaowei Chi
Lei Deng
Tianyi Yan
Feng Gao
Guoqi Li
MQ
28
6
0
27 May 2021
PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery
Tianyi Zhang
Jie Lin
Peng Hu
Bin Zhao
M. Aly
13
3
0
27 May 2021
CARLS: Cross-platform Asynchronous Representation Learning System
Chun-Ta Lu
Yun Zeng
Da-Cheng Juan
Yicheng Fan
Zhe Li
...
Ariel Fuxman
Futang Peng
Zhen Li
Tom Duerig
Andrew Tomkins
10
0
0
26 May 2021