2203.07424
Cited By
Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation
14 March 2022
Liu Ke
Udit Gupta
Mark Hempstead
Carole-Jean Wu
Hsien-Hsin S. Lee
Xuan Zhang
Papers citing
"Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation"
15 / 15 papers shown
LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference
Yujeong Choi
Yunseong Kim
Minsoo Rhu
44
66
0
25 Oct 2020
Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
48
40
0
25 Oct 2020
Cross-Stack Workload Characterization of Deep Recommendation Systems
Samuel Hsia
Udit Gupta
Mark Wilkening
Carole-Jean Wu
Gu-Yeon Wei
David Brooks
BDL
GNN
HAI
112
32
0
10 Oct 2020
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference
Udit Gupta
Samuel Hsia
V. Saraph
Xiaodong Wang
Brandon Reagen
Gu-Yeon Wei
Hsien-Hsin S. Lee
David Brooks
Carole-Jean Wu
GNN
58
189
0
08 Jan 2020
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke
Udit Gupta
Carole-Jean Wu
B. Cho
Mark Hempstead
...
Dheevatsa Mudigere
Maxim Naumov
Martin D. Schatz
M. Smelyanskiy
Xiaodong Wang
53
217
0
30 Dec 2019
TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
43
210
0
08 Aug 2019
The Architectural Implications of Facebook's DNN-based Personalized Recommendation
Udit Gupta
Carole-Jean Wu
Xiaodong Wang
Maxim Naumov
Brandon Reagen
...
Andrey Malevich
Dheevatsa Mudigere
M. Smelyanskiy
Liang Xiong
Xuan Zhang
GNN
73
290
0
06 Jun 2019
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
...
Wenlin Chen
Vijay Rao
Bill Jia
Liang Xiong
M. Smelyanskiy
85
733
0
31 May 2019
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
Assaf Eisenman
Maxim Naumov
Darryl Gardner
M. Smelyanskiy
S. Pupyrev
K. Hazelwood
Asaf Cidon
Sachin Katti
46
81
0
14 Nov 2018
Deep Interest Evolution Network for Click-Through Rate Prediction
Guorui Zhou
Na Mou
Ying Fan
Qi Pi
Weijie Bian
Chang Zhou
Xiaoqiang Zhu
Kun Gai
81
1,062
0
11 Sep 2018
Beyond Data and Model Parallelism for Deep Neural Networks
Zhihao Jia
Matei A. Zaharia
A. Aiken
GNN
AI4CE
56
505
0
14 Jul 2018
Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
Zhihao Jia
Sina Lin
C. Qi
A. Aiken
49
117
0
14 Feb 2018
Deep Interest Network for Click-Through Rate Prediction
Guorui Zhou
Cheng-Ning Song
Xiaoqiang Zhu
Xi-Wang Dai
Ziru Xu
Xiao Ma
Yanghui Yan
Junqi Jin
Han Li
Kun Gai
68
1,820
0
21 Jun 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
233
4,630
0
16 Apr 2017
Wide & Deep Learning for Recommender Systems
Heng-Tze Cheng
L. Koc
Jeremiah Harmsen
T. Shaked
Tushar Chandra
...
Zakaria Haque
Lichan Hong
Vihan Jain
Xiaobing Liu
Hemal Shah
HAI
VLM
166
3,658
0
24 Jun 2016