Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation

14 March 2022
Liu Ke
Udit Gupta
Mark Hempstead
Carole-Jean Wu
Hsien-Hsin S. Lee
Xuan Zhang

Papers citing "Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation"

15 / 15 papers shown
LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference
Yujeong Choi
Yunseong Kim
Minsoo Rhu
44
66
0
25 Oct 2020
Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
48
40
0
25 Oct 2020
Cross-Stack Workload Characterization of Deep Recommendation Systems
Samuel Hsia
Udit Gupta
Mark Wilkening
Carole-Jean Wu
Gu-Yeon Wei
David Brooks
BDL
GNN
HAI
112
32
0
10 Oct 2020
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference
Udit Gupta
Samuel Hsia
V. Saraph
Xiaodong Wang
Brandon Reagen
Gu-Yeon Wei
Hsien-Hsin S. Lee
David Brooks
Carole-Jean Wu
GNN
58
189
0
08 Jan 2020
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke
Udit Gupta
Carole-Jean Wu
B. Cho
Mark Hempstead
...
Dheevatsa Mudigere
Maxim Naumov
Martin D. Schatz
M. Smelyanskiy
Xiaodong Wang
53
217
0
30 Dec 2019
TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
43
210
0
08 Aug 2019
The Architectural Implications of Facebook's DNN-based Personalized Recommendation
Udit Gupta
Carole-Jean Wu
Xiaodong Wang
Maxim Naumov
Brandon Reagen
...
Andrey Malevich
Dheevatsa Mudigere
M. Smelyanskiy
Liang Xiong
Xuan Zhang
GNN
73
290
0
06 Jun 2019
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
...
Wenlin Chen
Vijay Rao
Bill Jia
Liang Xiong
M. Smelyanskiy
85
733
0
31 May 2019
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
Assaf Eisenman
Maxim Naumov
Darryl Gardner
M. Smelyanskiy
S. Pupyrev
K. Hazelwood
Asaf Cidon
Sachin Katti
46
81
0
14 Nov 2018
Deep Interest Evolution Network for Click-Through Rate Prediction
Guorui Zhou
Na Mou
Ying Fan
Qi Pi
Weijie Bian
Chang Zhou
Xiaoqiang Zhu
Kun Gai
81
1,062
0
11 Sep 2018
Beyond Data and Model Parallelism for Deep Neural Networks
Zhihao Jia
Matei A. Zaharia
A. Aiken
GNN
AI4CE
56
505
0
14 Jul 2018
Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
Zhihao Jia
Sina Lin
C. Qi
A. Aiken
49
117
0
14 Feb 2018
Deep Interest Network for Click-Through Rate Prediction
Guorui Zhou
Cheng-Ning Song
Xiaoqiang Zhu
Xi-Wang Dai
Ziru Xu
Xiao Ma
Yanghui Yan
Junqi Jin
Han Li
Kun Gai
68
1,820
0
21 Jun 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
233
4,630
0
16 Apr 2017
Wide & Deep Learning for Recommender Systems
Heng-Tze Cheng
L. Koc
Jeremiah Harmsen
T. Shaked
Tushar Chandra
...
Zakaria Haque
Lichan Hong
Vihan Jain
Xiaobing Liu
Hemal Shah
HAI
VLM
166
3,658
0
24 Jun 2016