In-Datacenter Performance Analysis of a Tensor Processing Unit


16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,165 papers shown
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu
Ruihao Gong
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
40
151
0
29 Dec 2019
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM
Aayush Ankit
I. E. Hajj
S. R. Chalamalasetti
S. Agarwal
M. Marinella
M. Foltin
J. Strachan
D. Milojicic
Wen-mei W. Hwu
Kaushik Roy
21
65
0
24 Dec 2019
Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
J. Puigcerver
Jessica Yung
Sylvain Gelly
N. Houlsby
MQ
114
1,183
0
24 Dec 2019
A Survey on Distributed Machine Learning
Joost Verbraeken
Matthijs Wolting
Jonathan Katzy
Jeroen Kloppenburg
Tim Verbelen
Jan S. Rellermeyer
OOD
42
692
0
20 Dec 2019
FQ-Conv: Fully Quantized Convolution for Efficient and Accurate Inference
Bram-Ernst Verhoef
Nathan Laubeuf
S. Cosemans
P. Debacker
Ioannis A. Papistas
A. Mallik
D. Verkest
MQ
19
16
0
19 Dec 2019
Spiking Networks for Improved Cognitive Abilities of Edge Computing Devices
Anton Akusok
Kaj-Mikael Björk
L. E. Leal
Y. Miché
Renjie Hu
A. Lendasse
4
4
0
19 Dec 2019
A flexible FPGA accelerator for convolutional neural networks
Kingshuk Majumder
Uday Bondhugula
28
5
0
16 Dec 2019
High-resolution imaging on TPUs
F. Huot
Yi-Fan Chen
R. Clapp
C. Boneti
John R. Anderson
22
13
0
13 Dec 2019
SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads
S. Xi
Yuan Yao
K. Bhardwaj
P. Whatmough
Gu-Yeon Wei
David Brooks
14
30
0
10 Dec 2019
Machine Unlearning
Lucas Bourtoule
Varun Chandrasekaran
Christopher A. Choquette-Choo
Hengrui Jia
Adelin Travers
Baiwu Zhang
David Lie
Nicolas Papernot
MU
65
818
0
09 Dec 2019
VALAN: Vision and Language Agent Navigation
L. Lansing
Vihan Jain
Harsh Mehta
Haoshuo Huang
Eugene Ie
LM&Ro
AI4TS
11
8
0
06 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chaoxia Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krahenbuhl
32
94
0
02 Dec 2019
FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation
L. Hoang
Muhammad Abdullah Hanif
Mohamed Bennai
AI4CE
14
114
0
02 Dec 2019
Intelligent Resource Scheduling for Co-located Latency-critical Services: A Multi-Model Collaborative Learning Approach
Lei Liu
VLM
17
10
0
26 Nov 2019
Structured Multi-Hashing for Model Compression
Elad Eban
Yair Movshovitz-Attias
Hao Wu
Mark Sandler
Andrew Poon
Yerlan Idelbayev
M. A. Carreira-Perpiñán
17
18
0
25 Nov 2019
Machine Learning based detection of multiple Wi-Fi BSSs for LTE-U CSAT
V. Sathya
Adam Dziedzic
M. Ghosh
S. Krishnan
9
4
0
21 Nov 2019
Search to Distill: Pearls are Everywhere but not the Eyes
Yu Liu
Xuhui Jia
Mingxing Tan
Raviteja Vemulapalli
Yukun Zhu
Bradley Green
Xiaogang Wang
30
68
0
20 Nov 2019
Selective sampling for accelerating training of deep neural networks
Berry Weinstein
Shai Fine
Y. Hel-Or
16
3
0
16 Nov 2019
NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units
Bongjoon Hyun
Youngeun Kwon
Yujeong Choi
John Kim
Minsoo Rhu
17
28
0
15 Nov 2019
ASV: Accelerated Stereo Vision System
Yu Feng
P. Whatmough
Yuhao Zhu
19
34
0
15 Nov 2019
The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design
J. Dean
22
79
0
13 Nov 2019
Learning from Data-Rich Problems: A Case Study on Genetic Variant Calling
Ren Yi
Pi-Chuan Chang
Gunjan Baid
Andrew Carroll
28
2
0
12 Nov 2019
RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search
Sam Green
C. Vineyard
Ryan L. Helinski
Ç. Koç
22
3
0
08 Nov 2019
The Pitfall of Evaluating Performance on Emerging AI Accelerators
Zihan Jiang
Jiansong Li
Jiangfeng Zhan
19
2
0
08 Nov 2019
MERIT: Tensor Transform for Memory-Efficient Vision Processing on Parallel Architectures
Yu-Sheng Lin
Wei-Chao Chen
Shao-Yi Chien
38
4
0
07 Nov 2019
Boosting LSTM Performance Through Dynamic Precision Selection
Franyell Silfa
J. Arnau
Antonio González
MQ
21
5
0
07 Nov 2019
SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network
R. Yazdani
Olatunji Ruwase
Minjia Zhang
Yuxiong He
J. Arnau
Antonio González
38
4
0
04 Nov 2019
Ternary MobileNets via Per-Layer Hybrid Filter Banks
Dibakar Gope
Jesse G. Beu
Urmish Thakker
Matthew Mattina
MQ
32
15
0
04 Nov 2019
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
Lei Deng
Yujie Wu
Yifan Hu
Ling Liang
Guoqi Li
Xing Hu
Yufei Ding
Peng Li
Yuan Xie
39
82
0
03 Nov 2019
Progressive Compressed Records: Taking a Byte out of Deep Learning Data
Michael Kuchnik
George Amvrosiadis
Virginia Smith
19
9
0
01 Nov 2019
ALERT: Accurate Learning for Energy and Timeliness
Chengcheng Wan
M. Santriaji
E. Rogers
H. Hoffmann
Michael Maire
Shan Lu
AI4CE
48
40
0
31 Oct 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars
F. García-Redondo
Shidhartha Das
G. Rosendale
19
5
0
30 Oct 2019
Recognizing long-form speech using streaming end-to-end models
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
29
129
0
24 Oct 2019
LUTNet: Learning FPGA Configurations for Highly Efficient Neural Network Inference
Erwei Wang
James J. Davis
P. Cheung
George A. Constantinides
MQ
9
41
0
24 Oct 2019
ELSA: A Throughput-Optimized Design of an LSTM Accelerator for Energy-Constrained Devices
E. Azari
S. Vrudhula
20
5
0
19 Oct 2019
SPEC2: SPECtral SParsE CNN Accelerator on FPGAs
Yue Niu
Hanqing Zeng
Ajitesh Srivastava
Kartik Lakhotia
Rajgopal Kannan
Yanzhi Wang
Viktor Prasanna
MQ
21
8
0
16 Oct 2019
OverQ: Opportunistic Outlier Quantization for Neural Network Accelerators
Ritchie Zhao
Jordan Dotzel
Zhanqiu Hu
Preslav Ivanov
Christopher De Sa
Zhiru Zhang
MQ
30
1
0
13 Oct 2019
eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference
Chao-Tsung Huang
Yu-Chun Ding
Huan-Ching Wang
Chi-Wen Weng
Kai-Ping Lin
Li-Wei Wang
Li-De Chen
14
42
0
13 Oct 2019
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM
Skanda Koppula
Lois Orosa
A. G. Yaglikçi
Roknoddin Azizi
Taha Shahroodi
Konstantinos Kanellopoulos
O. Mutlu
27
105
0
12 Oct 2019
Inundation Modeling in Data Scarce Regions
Z. Ben-Haim
V. Anisimov
Aaron Yonas
Varun Gulshan
Yusef Shafi
Stephan Hoyer
Sella Nevo
AI4CE
11
11
0
11 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
22
110
0
09 Oct 2019
Synthesizing Credit Card Transactions
E. Altman
21
32
0
04 Oct 2019
Training Multiscale-CNN for Large Microscopy Image Classification in One Hour
Kushal Datta
Imtiaz Hossain
Sun Choi
V. Saletore
Kyle H. Ambert
William J. Godinez
Xian Zhang
6
4
0
03 Oct 2019
Accelerating Data Loading in Deep Neural Network Training
Chih-Chieh Yang
Guojing Cong
22
36
0
02 Oct 2019
MLPerf Training Benchmark
Arya D. McCarthy
Christine Cheng
Cody Coleman
Greg Diamos
Paulius Micikevicius
...
Carole-Jean Wu
Lingjie Xu
Masafumi Yamazaki
C. Young
Matei A. Zaharia
47
307
0
02 Oct 2019
QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning
Srivatsan Krishnan
Maximilian Lam
Sharad Chitlangia
Zishen Wan
Gabriel Barth-Maron
Aleksandra Faust
Vijay Janapa Reddi
MQ
29
23
0
02 Oct 2019
Accelerating Deep Learning by Focusing on the Biggest Losers
Angela H. Jiang
Daniel L.-K. Wong
Giulio Zhou
D. Andersen
J. Dean
...
Gauri Joshi
M. Kaminsky
M. Kozuch
Zachary Chase Lipton
Padmanabhan Pillai
19
119
0
02 Oct 2019
AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference
Thierry Tambe
En-Yu Yang
Zishen Wan
Yuntian Deng
Vijay Janapa Reddi
Alexander M. Rush
David Brooks
Gu-Yeon Wei
MQ
19
21
0
29 Sep 2019
Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator
Tian Zhao
Yaqi Zhang
K. Olukotun
33
16
0
26 Sep 2019
CAT: Compression-Aware Training for bandwidth reduction
Chaim Baskin
Brian Chmiel
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
22
10
0
25 Sep 2019