In-Datacenter Performance Analysis of a Tensor Processing Unit

16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,164 papers shown
Massively Parallel Continuous Local Search for Hybrid SAT Solving on GPUs
Yunuo Cen
Zhiwei Zhang
Xuanyao Fong
26
1
0
29 Aug 2023
Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?
Seyed Morteza Nabavinejad
M. Ebrahimi
Sherief Reda
27
1
0
26 Aug 2023
TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs
P. Phothilimthana
Sami Abu-El-Haija
Kaidi Cao
Bahare Fatemi
Mike Burrows
Charith Mendis
Bryan Perozzi
GNN
AI4TS
33
17
0
25 Aug 2023
An Open-Source ML-Based Full-Stack Optimization Framework for Machine Learning Accelerators
H. Esmaeilzadeh
Soroush Ghodrati
A. Kahng
Joo-Young Kim
Sean Kinzer
...
R. Mahapatra
Susmita Dey Manasi
S. Sapatnekar
Zhiang Wang
Ziqing Zeng
27
4
0
23 Aug 2023
Accelerating Exact Combinatorial Optimization via RL-based Initialization -- A Case Study in Scheduling
Jiaqi Yin
Cunxi Yu
21
2
0
19 Aug 2023
A Survey of Spanish Clinical Language Models
Guillem García Subies
Á. Jiménez
Paloma Martínez
LM&MA
ELM
LRM
29
0
0
04 Aug 2023
DiviML: A Module-based Heuristic for Mapping Neural Networks onto Heterogeneous Platforms
Yassine Ghannane
Mohamed S. Abdelfattah
13
2
0
31 Jul 2023
HUGE: Huge Unsupervised Graph Embeddings with TPUs
Brandon Mayer
Anton Tsitsulin
Hendrik Fichtenberger
Jonathan J. Halcrow
Bryan Perozzi
GNN
13
1
0
26 Jul 2023
Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation
Stylianos I. Venieris
Javier Fernandez-Marques
Nicholas D. Lane
MQ
29
3
0
25 Jul 2023
Leveraging Deep Learning and Online Source Sentiment for Financial Portfolio Management
K. Srivatsan
Loukia Avramelou
Georgios Rodinos
Maria Tzelepi
Muzammal Naseer
...
Manos Kirtas
Pavlos Tosidis
Avraam Tsantekidis
Nikolaos Passalis
Anastasios Tefas
AIFin
40
2
0
23 Jul 2023
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Vasileios Leon
Muhammad Abdullah Hanif
Giorgos Armeniakos
Xun Jiao
Muhammad Shafique
K. Pekmestzi
Dimitrios Soudris
42
3
0
20 Jul 2023
MGit: A Model Versioning and Management System
Wei Hao
Daniel Mendoza
Rafael Ferreira da Silva
Deepak Narayanan
Amar Phanishayee
VLM
27
1
0
14 Jul 2023
Rational Neural Network Controllers
M. Newton
A. Papachristodoulou
OOD
AAML
37
1
0
12 Jul 2023
MG3MConv: Multi-Grained Matrix-Multiplication-Mapping Convolution Algorithm toward the SW26010 Processor
Zheng-Kuo Wu
20
1
0
11 Jul 2023
Performance Analysis of DNN Inference/Training with Convolution and non-Convolution Operations
H. Esmaeilzadeh
Soroush Ghodrati
A. Kahng
Sean Kinzer
Susmita Dey Manasi
S. Sapatnekar
Zhiang Wang
27
2
0
29 Jun 2023
CIMulator: A Comprehensive Simulation Platform for Computing-In-Memory Circuit Macros with Low Bit-Width and Real Memory Materials
Hoang-Hiep Le
M. Baig
Wei-Chen Hong
Chengshian Tsai
Cheng-Jui Yeh
...
Nan-Yow Chen
Wen-Jay Lee
Ing-Chao Lin
Da-Wei Chang
D. Lu
18
1
0
26 Jun 2023
Accelerating SNN Training with Stochastic Parallelizable Spiking Neurons
Sidi Yaya Arnaud Yarga
Sean U. N. Wood
16
8
0
22 Jun 2023
Subgraph Stationary Hardware-Software Inference Co-Design
Payman Behnam
Jianming Tong
Alind Khare
Yang Chen
Yue Pan
Pranav Gadikar
Abhimanyu Bambhaniya
T. Krishna
Alexey Tumanov
25
4
0
21 Jun 2023
Opportunities of Renewable Energy Powered DNN Inference
Seyed Morteza Nabavinejad
Tian Guo
AI4CE
26
2
0
21 Jun 2023
DGEMM on Integer Matrix Multiplication Unit
Hiroyuki Ootomo
K. Ozaki
Rio Yokota
17
12
0
21 Jun 2023
ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design
Srivatsan Krishnan
Amir Yazdanbaksh
Shvetank Prakash
Jason J. Jabbour
Ikechukwu Uchendu
...
Behzad Boroujerdian
Daniel Richins
Devashree Tripathy
Aleksandra Faust
Vijay Janapa Reddi
47
12
0
15 Jun 2023
KAPLA: Pragmatic Representation and Fast Solving of Scalable NN Accelerator Dataflow
Zhiyao Li
Mingyu Gao
21
1
0
09 Jun 2023
Revisiting Neural Retrieval on Accelerators
Jiaqi Zhai
Zhaojie Gong
Yueming Wang
Xiao Sun
Zheng Yan
Fu Li
Xing Liu
15
10
0
06 Jun 2023
Streaming Task Graph Scheduling for Dataflow Architectures
T. De Matteis
Lukas Gianinazzi
Johannes de Fine Licht
Torsten Hoefler
GNN
19
3
0
05 Jun 2023
Edit Distance based RL for RNNT decoding
DongSeon Hwang
Changwan Ryu
K. Sim
21
0
0
31 May 2023
Intriguing Properties of Quantization at Scale
Arash Ahmadian
Saurabh Dash
Hongyu Chen
Bharat Venkitesh
Stephen Gou
Phil Blunsom
Ahmet Üstün
Sara Hooker
MQ
54
38
0
30 May 2023
NicePIM: Design Space Exploration for Processing-In-Memory DNN Accelerators with 3D-Stacked-DRAM
Junpeng Wang
Mengke Ge
Bo Ding
Qi Xu
Song Chen
Yi Kang
30
5
0
30 May 2023
Global-QSGD: Practical Floatless Quantization for Distributed Learning with Theoretical Guarantees
Jihao Xin
Marco Canini
Peter Richtárik
Samuel Horváth
36
2
0
29 May 2023
Multiplication-Free Transformer Training via Piecewise Affine Operations
Atli Kosson
Martin Jaggi
21
4
0
26 May 2023
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration
Ahmed F. AbouElhamayed
Angela Cui
Javier Fernandez-Marques
Nicholas D. Lane
Mohamed S. Abdelfattah
MQ
29
4
0
25 May 2023
NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix Operations for Efficient Inference
Ruiqi Sun
Siwei Ye
Jie Zhao
Xin He
Yiran Li
An Zou
35
0
0
23 May 2023
HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity
Yannan Nellie Wu
Po-An Tsai
Saurav Muralidharan
A. Parashar
Vivienne Sze
J. Emer
34
23
0
22 May 2023
FAQ: Mitigating the Impact of Faults in the Weight Memory of DNN Accelerators through Fault-Aware Quantization
Muhammad Abdullah Hanif
Muhammad Shafique
AAML
39
2
0
21 May 2023
ProgSG: Cross-Modality Representation Learning for Programs in Electronic Design Automation
Yunsheng Bai
Atefeh Sohrabizadeh
Zongyue Qin
Ziniu Hu
Yizhou Sun
Jason Cong
26
1
0
18 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
Fast Matrix Multiplication via Compiler-only Layered Data Reorganization and Intrinsic Lowering
Braedy Kuzma
Ivan Korostelev
J. P. L. Carvalho
José Moreira
Christopher Barton
Guido Araujo
J. N. Amaral
11
3
0
15 May 2023
MoCA: Memory-Centric, Adaptive Execution for Multi-Tenant Deep Neural Networks
Seah Kim
Hasan Genç
Vadim Nikiforov
Krste Asanović
B. Nikolić
Y. Shao
27
18
0
10 May 2023
A Systematic Literature Review on Hardware Reliability Assessment Methods for Deep Neural Networks
Mohammad Hasan Ahmadilivani
Mahdi Taheri
J. Raik
Masoud Daneshtalab
M. Jenihhin
37
25
0
09 May 2023
Energy-Latency Attacks to On-Device Neural Networks via Sponge Poisoning
Zijian Wang
Shuo Huang
Yu-Jen Huang
Helei Cui
SILM
27
10
0
06 May 2023
Hardware Acceleration of Explainable Artificial Intelligence
Zhixin Pan
Prabhat Mishra
26
0
0
04 May 2023
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs
Deepak Narayanan
Keshav Santhanam
Peter Henderson
Rishi Bommasani
Tony Lee
Percy Liang
145
3
0
03 May 2023
Rubik's Optical Neural Networks: Multi-task Learning with Physics-aware Rotation Architecture
Yingjie Li
Weilu Gao
Cunxi Yu
30
3
0
25 Apr 2023
SALSA: Simulated Annealing based Loop-Ordering Scheduler for DNN Accelerators
Victor J. B. Jung
Arne Symons
L. Mei
Marian Verhelst
Luca Benini
21
3
0
20 Apr 2023
eFAT: Improving the Effectiveness of Fault-Aware Training for Mitigating Permanent Faults in DNN Hardware Accelerators
Muhammad Abdullah Hanif
Muhammad Shafique
11
2
0
20 Apr 2023
Massive Data-Centric Parallelism in the Chiplet Era
Marcelo Orenes-Vera
Esin Tureci
D. Wentzlaff
M. Martonosi
21
6
0
19 Apr 2023
Heterogeneous Integration of In-Memory Analog Computing Architectures with Tensor Processing Units
Mohammed E. Elbtity
Brendan Reidy
Md Hasibul Amin
Ramtin Zand
24
5
0
18 Apr 2023
Speck: A Smart event-based Vision Sensor with a low latency 327K Neuron Convolutional Neuronal Network Processing Pipeline
Ole Richter
Y. Xing
M. D. Marchi
Carsten Nielsen
M. Katsimpris
...
SynSense
Bio-Inspired Circuits
Sadique Sheik
T. Demirci
Groningen Cognitive Systems
27
55
0
13 Apr 2023
Training Large Language Models Efficiently with Sparsity and Dataflow
V. Srinivasan
Darshan Gandhi
Urmish Thakker
R. Prabhakar
MoE
38
6
0
11 Apr 2023
SamurAI: A Versatile IoT Node With Event-Driven Wake-Up and Embedded ML Acceleration
I. Miro-Panadès
Benoît Tain
J. Christmann
David Coriat
R. Lemaire
...
Jean-Marc Philippe
Y. Thonnart
A. Valentian
Frédéric Heitzmann
F. Clermidy
22
12
0
11 Apr 2023
Mixed-Precision Random Projection for RandNLA on Tensor Cores
Hiroyuki Ootomo
Rio Yokota
19
3
0
10 Apr 2023