NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units

15 November 2019
Bongjoon Hyun
Youngeun Kwon
Yujeong Choi
John Kim
Minsoo Rhu

Papers citing "NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units"

22 / 22 papers shown
PREMA: A Predictive Multi-task Scheduling Algorithm For Preemptible Neural Processing Units
Yujeong Choi
Minsoo Rhu
18
128
0
06 Sep 2019
Beyond Human-Level Accuracy: Computational Challenges in Deep Learning
Joel Hestness
Newsha Ardalani
G. Diamos
31
67
0
03 Sep 2019
TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
35
208
0
08 Aug 2019
The Architectural Implications of Facebook's DNN-based Personalized Recommendation
Udit Gupta
Carole-Jean Wu
Xiaodong Wang
Maxim Naumov
Brandon Reagen
...
Andrey Malevich
Dheevatsa Mudigere
M. Smelyanskiy
Liang Xiong
Xuan Zhang
GNN
65
290
0
06 Jun 2019
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
...
Wenlin Chen
Vijay Rao
Bill Jia
Liang Xiong
M. Smelyanskiy
60
726
0
31 May 2019
Beyond the Memory Wall: A Case for Memory-centric HPC System for Deep Learning
Youngeun Kwon
Minsoo Rhu
34
57
0
18 Feb 2019
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications
Jongsoo Park
Maxim Naumov
Protonu Basu
Summer Deng
Aravind Kalaiah
...
Lin Qiao
Vijay Rao
Nadav Rotem
S. Yoo
M. Smelyanskiy
FedML
GNN
BDL
50
187
0
24 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
943
93,936
0
11 Oct 2018
Bit-Tactical: Exploiting Ineffectual Computations in Convolutional Neural Networks: Which, Why, and How
A. Delmas
Patrick Judd
Dylan Malone Stuart
Zissis Poulos
Mostafa Mahmoud
Sayeh Sharify
Milos Nikolic
Andreas Moshovos
39
24
0
09 Mar 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
435
129,831
0
12 Jun 2017
SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
A. Parashar
Minsoo Rhu
Anurag Mukkara
A. Puglielli
Rangharajan Venkatesan
Brucek Khailany
J. Emer
S. Keckler
W. Dally
54
1,122
0
23 May 2017
Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks
Minsoo Rhu
Mike O'Connor
Niladrish Chatterjee
Jeff Pool
S. Keckler
42
176
0
03 May 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
170
4,619
0
16 Apr 2017
Bit-pragmatic Deep Neural Network Computing
Jorge Albericio
Patrick Judd
A. Delmas
Sayeh Sharify
Andreas Moshovos
MQ
54
239
0
20 Oct 2016
Accelerating Deep Convolutional Networks using low-precision and sparsity
Ganesh Venkatesh
Eriko Nurvitadhi
Debbie Marr
56
135
0
02 Oct 2016
Exploring the Limits of Language Modeling
Rafal Jozefowicz
Oriol Vinyals
M. Schuster
Noam M. Shazeer
Yonghui Wu
118
1,143
0
07 Feb 2016
EIE: Efficient Inference Engine on Compressed Deep Neural Network
Song Han
Xingyu Liu
Huizi Mao
Jing Pu
A. Pedram
M. Horowitz
W. Dally
102
2,453
0
04 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.4K
192,638
0
10 Dec 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
126
2,261
0
05 Aug 2015
Neural Turing Machines
Alex Graves
Greg Wayne
Ivo Danihelka
64
2,318
0
20 Oct 2014
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
299
43,511
0
17 Sep 2014
One weird trick for parallelizing convolutional neural networks
A. Krizhevsky
GNN
74
1,297
0
23 Apr 2014