In-Datacenter Performance Analysis of a Tensor Processing Unit

16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,167 papers shown
AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs
Pengfei Xu
Xiaofan Zhang
Cong Hao
Yang Zhao
Yongan Zhang
Yue Wang
Chaojian Li
Zetong Guan
Deming Chen
Yingyan Lin
109
91
0
06 Jan 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
96
82
0
02 Jan 2020
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke
Udit Gupta
Carole-Jean Wu
B. Cho
Mark Hempstead
...
Dheevatsa Mudigere
Maxim Naumov
Martin D. Schatz
M. Smelyanskiy
Xiaodong Wang
80
225
0
30 Dec 2019
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu
Ruihao Gong
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
97
152
0
29 Dec 2019
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM
Aayush Ankit
I. E. Hajj
S. R. Chalamalasetti
S. Agarwal
M. Marinella
M. Foltin
J. Strachan
D. Milojicic
Wen-mei W. Hwu
Kaushik Roy
70
68
0
24 Dec 2019
Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
J. Puigcerver
Jessica Yung
Sylvain Gelly
N. Houlsby
MQ
310
1,211
0
24 Dec 2019
A Survey on Distributed Machine Learning
Joost Verbraeken
Matthijs Wolting
Jonathan Katzy
Jeroen Kloppenburg
Tim Verbelen
Jan S. Rellermeyer
OOD
122
715
0
20 Dec 2019
FQ-Conv: Fully Quantized Convolution for Efficient and Accurate Inference
Bram-Ernst Verhoef
Nathan Laubeuf
S. Cosemans
P. Debacker
Ioannis A. Papistas
A. Mallik
D. Verkest
MQ
65
16
0
19 Dec 2019
Spiking Networks for Improved Cognitive Abilities of Edge Computing Devices
Anton Akusok
Kaj-Mikael Björk
L. E. Leal
Y. Miché
Renjie Hu
A. Lendasse
39
4
0
19 Dec 2019
A flexible FPGA accelerator for convolutional neural networks
Kingshuk Majumder
Uday Bondhugula
57
5
0
16 Dec 2019
High-resolution imaging on TPUs
F. Huot
Yi-Fan Chen
R. Clapp
C. Boneti
John R. Anderson
44
13
0
13 Dec 2019
SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads
S. Xi
Yuan Yao
K. Bhardwaj
P. Whatmough
Gu-Yeon Wei
David Brooks
41
32
0
10 Dec 2019
Machine Unlearning
Lucas Bourtoule
Varun Chandrasekaran
Christopher A. Choquette-Choo
Hengrui Jia
Adelin Travers
Baiwu Zhang
David Lie
Nicolas Papernot
MU
197
888
0
09 Dec 2019
VALAN: Vision and Language Agent Navigation
L. Lansing
Vihan Jain
Harsh Mehta
Haoshuo Huang
Eugene Ie
LM&Ro
AI4TS
58
8
0
06 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chaoxia Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krahenbuhl
91
94
0
02 Dec 2019
FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation
L. Hoang
Muhammad Abdullah Hanif
Mohamed Bennai
AI4CE
66
116
0
02 Dec 2019
Intelligent Resource Scheduling for Co-located Latency-critical Services: A Multi-Model Collaborative Learning Approach
Lei Liu
VLM
24
10
0
26 Nov 2019
Structured Multi-Hashing for Model Compression
Elad Eban
Yair Movshovitz-Attias
Hao Wu
Mark Sandler
Andrew Poon
Yerlan Idelbayev
M. A. Carreira-Perpiñán
78
18
0
25 Nov 2019
Machine Learning based detection of multiple Wi-Fi BSSs for LTE-U CSAT
V. Sathya
Adam Dziedzic
M. Ghosh
S. Krishnan
18
4
0
21 Nov 2019
Search to Distill: Pearls are Everywhere but not the Eyes
Yu Liu
Xuhui Jia
Mingxing Tan
Raviteja Vemulapalli
Yukun Zhu
Bradley Green
Xiaogang Wang
109
68
0
20 Nov 2019
Selective sampling for accelerating training of deep neural networks
Berry Weinstein
Shai Fine
Y. Hel-Or
23
3
0
16 Nov 2019
NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units
Bongjoon Hyun
Youngeun Kwon
Yujeong Choi
John Kim
Minsoo Rhu
67
29
0
15 Nov 2019
ASV: Accelerated Stereo Vision System
Yu Feng
P. Whatmough
Yuhao Zhu
54
36
0
15 Nov 2019
The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design
J. Dean
55
79
0
13 Nov 2019
Learning from Data-Rich Problems: A Case Study on Genetic Variant Calling
Ren Yi
Pi-Chuan Chang
Gunjan Baid
Andrew Carroll
30
2
0
12 Nov 2019
RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search
Sam Green
C. Vineyard
Ryan L. Helinski
Ç. Koç
39
3
0
08 Nov 2019
The Pitfall of Evaluating Performance on Emerging AI Accelerators
Zihan Jiang
Jiansong Li
Jiangfeng Zhan
47
2
0
08 Nov 2019
MERIT: Tensor Transform for Memory-Efficient Vision Processing on Parallel Architectures
Yu-Sheng Lin
Wei-Chao Chen
Shao-Yi Chien
42
5
0
07 Nov 2019
Boosting LSTM Performance Through Dynamic Precision Selection
Franyell Silfa
J. Arnau
Antonio González
MQ
25
5
0
07 Nov 2019
SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network
R. Yazdani
Olatunji Ruwase
Minjia Zhang
Yuxiong He
J. Arnau
Antonio González
53
5
0
04 Nov 2019
Ternary MobileNets via Per-Layer Hybrid Filter Banks
Dibakar Gope
Jesse G. Beu
Urmish Thakker
Matthew Mattina
MQ
69
15
0
04 Nov 2019
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
Lei Deng
Yujie Wu
Yifan Hu
Ling Liang
Guoqi Li
Xing Hu
Yufei Ding
Peng Li
Yuan Xie
78
85
0
03 Nov 2019
Progressive Compressed Records: Taking a Byte out of Deep Learning Data
Michael Kuchnik
George Amvrosiadis
Virginia Smith
86
9
0
01 Nov 2019
ALERT: Accurate Learning for Energy and Timeliness
Chengcheng Wan
M. Santriaji
E. Rogers
H. Hoffmann
Michael Maire
Shan Lu
AI4CE
94
42
0
31 Oct 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars
F. García-Redondo
Shidhartha Das
G. Rosendale
47
5
0
30 Oct 2019
Recognizing long-form speech using streaming end-to-end models
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
76
130
0
24 Oct 2019
LUTNet: Learning FPGA Configurations for Highly Efficient Neural Network Inference
Erwei Wang
James J. Davis
P. Cheung
George A. Constantinides
MQ
54
44
0
24 Oct 2019
ELSA: A Throughput-Optimized Design of an LSTM Accelerator for Energy-Constrained Devices
E. Azari
S. Vrudhula
67
5
0
19 Oct 2019
SPEC2: SPECtral SParsE CNN Accelerator on FPGAs
Yue Niu
Hanqing Zeng
Ajitesh Srivastava
Kartik Lakhotia
Rajgopal Kannan
Yanzhi Wang
Viktor Prasanna
MQ
44
8
0
16 Oct 2019
OverQ: Opportunistic Outlier Quantization for Neural Network Accelerators
Ritchie Zhao
Jordan Dotzel
Zhanqiu Hu
Preslav Ivanov
Christopher De Sa
Zhiru Zhang
MQ
34
1
0
13 Oct 2019
eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference
Chao-Tsung Huang
Yu-Chun Ding
Huan-Ching Wang
Chi-Wen Weng
Kai-Ping Lin
Li-Wei Wang
Li-De Chen
69
44
0
13 Oct 2019
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM
Skanda Koppula
Lois Orosa
A. G. Yaglikçi
Roknoddin Azizi
Taha Shahroodi
Konstantinos Kanellopoulos
O. Mutlu
80
108
0
12 Oct 2019
Inundation Modeling in Data Scarce Regions
Z. Ben-Haim
V. Anisimov
Aaron Yonas
Varun Gulshan
Yusef Shafi
Stephan Hoyer
Sella Nevo
AI4CE
62
12
0
11 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
77
113
0
09 Oct 2019
Synthesizing Credit Card Transactions
E. Altman
65
35
0
04 Oct 2019
Training Multiscale-CNN for Large Microscopy Image Classification in One Hour
Kushal Datta
Imtiaz Hossain
Sun Choi
V. Saletore
Kyle H. Ambert
William J. Godinez
Xian Zhang
19
4
0
03 Oct 2019
Accelerating Data Loading in Deep Neural Network Training
Chih-Chieh Yang
Guojing Cong
78
38
0
02 Oct 2019
MLPerf Training Benchmark
Arya D. McCarthy
Christine Cheng
Cody Coleman
Greg Diamos
Paulius Micikevicius
...
Carole-Jean Wu
Lingjie Xu
Masafumi Yamazaki
C. Young
Matei A. Zaharia
113
315
0
02 Oct 2019
QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning
Srivatsan Krishnan
Maximilian Lam
Sharad Chitlangia
Zishen Wan
Gabriel Barth-Maron
Aleksandra Faust
Vijay Janapa Reddi
MQ
44
26
0
02 Oct 2019
Accelerating Deep Learning by Focusing on the Biggest Losers
Angela H. Jiang
Daniel L.-K. Wong
Giulio Zhou
D. Andersen
J. Dean
...
Gauri Joshi
M. Kaminsky
M. Kozuch
Zachary Chase Lipton
Padmanabhan Pillai
93
123
0
02 Oct 2019