ResearchTrend.AI
In-Datacenter Performance Analysis of a Tensor Processing Unit

16 April 2017
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh Bhatia
Nan Boden
Al Borchers
Rick Boyle
Pierre-luc Cantin
Clifford Chao
Chris Clark
Jeremy Coriell
Mike Daley
Matt Dau
Jeffrey Dean
Ben Gelb
Taraneh Ghaemmaghami
Rajendra Gottipati
William Gulland
Robert Hagmann
C. Richard Ho
Doug Hogberg
John Hu
R. Hundt
Dan Hurt
Julian Ibarz
A. Jaffey
Alek Jaworski
Alexander Kaplan
Harshit Khaitan
Andy Koch
Naveen Kumar
Steve Lacy
James Laudon
James Law
Diemthu Le
Chris Leary
Zhuyuan Liu
Kyle Lucke
Alan Lundin
Gordon MacKean
Adriana Maggiore
Maire Mahony
Kieran Miller
R. Nagarajan
Ravi Narayanaswami
Ray Ni
Kathy Nix
Thomas Norrie
Mark Omernick
Narayana Penukonda
Andy Phelps
Jonathan Ross
Matt Ross
Amir Salek
Emad Samadiani
Chris Severn
Gregory Sizikov
Matthew Snelham
Jed Souter
Dan Steinberg
Andy Swing
Mercedes Tan
Gregory Thorson
Bo Tian
Horia Toma
Erick Tuttle
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon

Papers citing "In-Datacenter Performance Analysis of a Tensor Processing Unit"

50 / 1,164 papers shown
Let Your Graph Do the Talking: Encoding Structured Data for LLMs
Bryan Perozzi
Bahare Fatemi
Dustin Zelle
Anton Tsitsulin
Mehran Kazemi
Rami Al-Rfou
Jonathan J. Halcrow
GNN
42
55
0
08 Feb 2024
Training DNN Models over Heterogeneous Clusters with Optimal Performance
Chengyi Nie
Jessica Maghakian
Zhenhua Liu
26
0
0
07 Feb 2024
Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression
Xiaoxin Su
Yipeng Zhou
Laizhong Cui
Song Guo
FedML
30
3
0
06 Feb 2024
HEANA: A Hybrid Time-Amplitude Analog Optical Accelerator with Flexible Dataflows for Energy-Efficient CNN Inference
Sairam Sri Vatsavai
Venkata Sai Praneeth Karempudi
Ishan G. Thakkar
32
0
0
05 Feb 2024
A Comparative Analysis of Microrings Based Incoherent Photonic GEMM Accelerators
Sairam Sri Vatsavai
Venkata Sai Praneeth Karempudi
Oluwaseun Adewunmi Alo
Ishan G. Thakkar
26
2
0
05 Feb 2024
ClipFormer: Key-Value Clipping of Transformers on Memristive Crossbars for Write Noise Mitigation
Abhiroop Bhattacharjee
Abhishek Moitra
Priyadarshini Panda
CLIP
24
6
0
04 Feb 2024
Data-Oblivious ML Accelerators using Hardware Security Extensions
Hossam ElAtali
John Z. Jekel
Lachlan J. Gunn
N. Asokan
14
0
0
29 Jan 2024
Digital-analog hybrid matrix multiplication processor for optical neural networks
Xiansong Meng
Deming Kong
K. Kim
Qiuchi Li
Po Dong
Ingemar J. Cox
Christina Lioma
Hao Hu
19
0
0
26 Jan 2024
PartIR: Composing SPMD Partitioning Strategies for Machine Learning
Sami Alabed
Daniel Belov
Bart Chrzaszcz
Juliana Franco
Dominik Grewe
...
Michael Schaarschmidt
Timur Sitdikov
Agnieszka Swietlik
Dimitrios Vytiniotis
Joel Wee
33
3
0
20 Jan 2024
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization
K. Balaskas
Andreas Karatzas
Christos Sad
K. Siozios
Iraklis Anagnostopoulos
Georgios Zervakis
Jörg Henkel
MQ
41
10
0
23 Dec 2023
Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching
Pengmiao Zhang
Neelesh Gupta
Rajgopal Kannan
Viktor K. Prasanna
48
0
0
23 Dec 2023
Experimental demonstration of magnetic tunnel junction-based computational random-access memory
Yang Lv
Brandon R. Zink
Robert P. Bloom
Husrev Cilasun
Pravin Khanal
...
Ali T. Habiboglu
Weigang Wang
S. Sapatnekar
Ulya R. Karpuzcu
Jian-Ping Wang
11
7
0
21 Dec 2023
Muchisim: A Simulation Framework for Design Exploration of Multi-Chip Manycore Systems
Marcelo Orenes-Vera
Esin Tureci
M. Martonosi
D. Wentzlaff
24
7
0
15 Dec 2023
Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding
Talfan Evans
Shreya Pathak
Hamza Merzic
Jonathan Schwarz
Ryutaro Tanno
Olivier J. Hénaff
20
16
0
08 Dec 2023
Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections
Marcel Wagenlander
Guo Li
Bo Zhao
Luo Mai
Peter R. Pietzuch
43
7
0
08 Dec 2023
On The Fairness Impacts of Hardware Selection in Machine Learning
Sree Harsha Nelaturu
Nishaanth Kanna Ravichandran
Cuong Tran
Sara Hooker
Ferdinando Fioretto
53
3
0
06 Dec 2023
The Landscape of Modern Machine Learning: A Review of Machine, Distributed and Federated Learning
Omer Subasi
Oceane Bel
Joseph Manzano
Kevin J. Barker
FedML
OOD
PINN
28
2
0
05 Dec 2023
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
Yushi Hu
Otilia Stretcu
Chun-Ta Lu
Krishnamurthy Viswanathan
Kenji Hata
Enming Luo
Ranjay Krishna
Ariel Fuxman
VLM
LRM
MLLM
52
30
0
05 Dec 2023
Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments
Shanqing Cai
Subhashini Venugopalan
Katie Seaver
Xiang Xiao
Katrin Tomanek
...
Daniel E Vance
Blair Casey
Steve M. Gleason
Philip Q. Nelson
Michael P. Brenner
30
7
0
03 Dec 2023
Monitor Placement for Fault Localization in Deep Neural Network Accelerators
Wei-Kai Liu
30
0
0
28 Nov 2023
Tascade: Hardware Support for Atomic-free, Asynchronous and Efficient Reduction Trees
Marcelo Orenes-Vera
Esin Tureci
D. Wentzlaff
M. Martonosi
16
2
0
27 Nov 2023
Learning to Skip for Language Modeling
Dewen Zeng
Nan Du
Tao Wang
Yuanzhong Xu
Tao Lei
Zhifeng Chen
Claire Cui
25
11
0
26 Nov 2023
Large Language Models in Law: A Survey
Jinqi Lai
Wensheng Gan
Jiayang Wu
Zhenlian Qi
Philip S. Yu
ELM
AILaw
34
72
0
26 Nov 2023
Locally Optimal Descent for Dynamic Stepsize Scheduling
Gilad Yehudai
Alon Cohen
Amit Daniely
Yoel Drori
Tomer Koren
Mariano Schain
37
0
0
23 Nov 2023
REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints
Francesco Corti
Balz Maag
Joachim Schauer
U. Pferschy
O. Saukh
37
2
0
22 Nov 2023
Fast Inner-Product Algorithms and Architectures for Deep Neural Network Accelerators
Trevor E. Pogue
N. Nicolici
30
3
0
20 Nov 2023
Tensor-Aware Energy Accounting
Timur Babakol
Yu David Liu
21
3
0
19 Nov 2023
DLAS: An Exploration and Assessment of the Deep Learning Acceleration Stack
Perry Gibson
José Cano
Elliot J. Crowley
Amos Storkey
Michael F. P. O'Boyle
27
1
0
15 Nov 2023
Harnessing Manycore Processors with Distributed Memory for Accelerated Training of Sparse and Recurrent Models
Jan Finkbeiner
Thomas Gmeinder
M. Pupilli
A. Titterton
Emre Neftci
29
3
0
07 Nov 2023
Practical Performance Guarantees for Pipelined DNN Inference
Aaron Archer
Matthew Fahrbach
Kuikui Liu
Prakash Prabhu
29
0
0
07 Nov 2023
Remaining useful life prediction of Lithium-ion batteries using spatio-temporal multimodal attention networks
Sungho Suh
D. Mittal
Hymalai Bello
Bo Zhou
M. Jha
P. Lukowicz
22
3
0
29 Oct 2023
Restoring the Broken Covenant Between Compilers and Deep Learning Accelerators
Sean Kinzer
Soroush Ghodrati
R. Mahapatra
Byung Hoon Ahn
Edwin Mascarenhas
Xiaolong Li
J. Matai
Liang Zhang
H. Esmaeilzadeh
27
2
0
27 Oct 2023
GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation
Jhe-Yu Liou
Stephanie Forrest
Carole-Jean Wu
VLM
19
0
0
16 Oct 2023
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models
Wenqi Jiang
Marco Zeller
R. Waleffe
Torsten Hoefler
Gustavo Alonso
54
14
0
15 Oct 2023
Accelerating Machine Learning Primitives on Commodity Hardware
R. Snytsar
31
0
0
08 Oct 2023
mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis
Alexander Brauckmann
Elizabeth Polgreen
Tobias Grosser
Michael F. P. O'Boyle
8
2
0
06 Oct 2023
MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Samuel Hsia
Alicia Golden
Bilge Acun
Newsha Ardalani
Zach DeVito
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
MoE
53
9
0
04 Oct 2023
Photonic Accelerators for Image Segmentation in Autonomous Driving and Defect Detection
Lakshmi Nair
David Widemann
Brad Turcott
Nick Moore
Alexandra Wleklinski
D. Bunandar
Ioannis Papavasileiou
Shihu Wang
Eric Logan
24
0
0
28 Sep 2023
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
34
15
0
28 Sep 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
40
84
0
25 Sep 2023
Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
Christopher Subia-Waud
S. Dasmahapatra
UQCV
MQ
21
0
0
24 Sep 2023
Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design
Chao Fang
Wei Sun
Aojun Zhou
Zhongfeng Wang
19
3
0
22 Sep 2023
A Machine Learning-oriented Survey on Tiny Machine Learning
Luigi Capogrosso
Federico Cunico
D. Cheng
Franco Fummi
Marco Cristani
SyDa
MU
32
34
0
21 Sep 2023
Logic Design of Neural Networks for High-Throughput and Low-Power Applications
Kangwei Xu
Grace Li Zhang
Ulf Schlichtmann
Bing Li
30
3
0
19 Sep 2023
USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models
Guanlong Zhao
Yongqiang Wang
Jason W. Pelecanos
Yu Zhang
Hank Liao
Yiling Huang
Han Lu
Quan Wang
25
4
0
14 Sep 2023
Autotuning Apache TVM-based Scientific Applications Using Bayesian Optimization
Xingfu Wu
P. Paramasivam
Valerie Taylor
21
3
0
13 Sep 2023
The Grand Illusion: The Myth of Software Portability and Implications for ML Progress
Fraser Mince
Dzung Dinh
Jonas Kgomo
Neil Thompson
Sara Hooker
19
6
0
12 Sep 2023
Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI
Dustin Wright
Christian Igel
Gabrielle Samuel
Raghavendra Selvan
35
15
0
05 Sep 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
27
0
0
04 Sep 2023
Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library
Hiroyuki Ootomo
Rio Yokota
13
7
0
29 Aug 2023