ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models

11 June 2024
Yujeong Choi
Jiin Kim
Minsoo Rhu

Papers citing "ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models"

27 / 27 papers shown
TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings
N. Jouppi
George Kurian
Sheng Li
Peter C. Ma
R. Nagarajan
...
Brian Towles
C. Young
Xiaoping Zhou
Zongwei Zhou
David A. Patterson
BDL
VLM
106
355
0
04 Apr 2023
MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
Samuel Hsia
Udit Gupta
Bilge Acun
Newsha Ardalani
Pan Zhong
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
78
17
0
21 Feb 2023
DisaggRec: Architecting Disaggregated Systems for Large-Scale Personalized Recommendation
Liu Ke
Xuan Zhang
Benjamin C. Lee
G. E. Suh
Hsien-Hsin S. Lee
48
8
0
02 Dec 2022
Merlin HugeCTR: GPU-accelerated Recommender System Training and Inference
Zehuan Wang
Yingcan Wei
Minseok Lee
Matthias Langer
F. Yu
...
Daniel G. Abel
Xu Guo
Jianbing Dong
Ji Shi
Kunlun Li
GNN
LRM
27
32
0
17 Oct 2022
Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards
Youngeun Kwon
Minsoo Rhu
44
27
0
10 May 2022
Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation
Liu Ke
Udit Gupta
Mark Hempstead
Carole-Jean Wu
Hsien-Hsin S. Lee
Xuan Zhang
33
21
0
14 Mar 2022
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
Geet Sethi
Bilge Acun
Niket Agarwal
Christos Kozyrakis
Caroline Trippel
Carole-Jean Wu
71
67
0
25 Jan 2022
HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework
Xupeng Miao
Hailin Zhang
Yining Shi
Xiaonan Nie
Zhi-Xin Yang
Yangyu Tao
Tengjiao Wang
47
57
0
14 Dec 2021
Supporting Massive DLRM Inference Through Software Defined Memory
E. K. Ardestani
Changkyu Kim
Seung Jae Lee
Luoshang Pan
Valmiki Rampersad
...
Krishnakumar Nair
Maxim Naumov
Christopher Peterson
M. Smelyanskiy
Vijay Rao
BDL
64
20
0
21 Oct 2021
JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
Hao Liu
Qian Gao
Jiang Li
X. Liao
Hao Xiong
...
Guobao Yang
Zhiwei Zha
Daxiang Dong
Dejing Dou
Haoyi Xiong
VLM
50
22
0
03 Jun 2021
RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance
Udit Gupta
Samuel Hsia
J. Zhang
Mark Wilkening
Javin Pombra
Hsien-Hsin S. Lee
Gu-Yeon Wei
Carole-Jean Wu
David Brooks
51
32
0
18 May 2021
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Dheevatsa Mudigere
Y. Hao
Jianyu Huang
Zhihao Jia
Andrew Tulloch
...
Ajit Mathews
Lin Qiao
M. Smelyanskiy
Bill Jia
Vijay Rao
72
152
0
12 Apr 2021
Accelerating Recommendation System Training by Leveraging Popular Choices
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Prashant J. Nair
41
57
0
01 Mar 2021
RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference
Mark Wilkening
Udit Gupta
Samuel Hsia
Caroline Trippel
Carole-Jean Wu
David Brooks
Gu-Yeon Wei
40
114
0
29 Jan 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
73
104
0
25 Jan 2021
Understanding Training Efficiency of Deep Learning Recommendation Models at Scale
Bilge Acun
Matthew Murphy
Xiaodong Wang
Jade Nie
Carole-Jean Wu
K. Hazelwood
65
112
0
11 Nov 2020
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui
Yavuz Yetim
Özgür Özkan
Zhuoran Zhao
Shin-Yeh Tsai
Carole-Jean Wu
Mark Hempstead
GNN
BDL
LRM
48
51
0
04 Nov 2020
Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
41
40
0
25 Oct 2020
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference
Udit Gupta
Samuel Hsia
V. Saraph
Xiaodong Wang
Brandon Reagen
Gu-Yeon Wei
Hsien-Hsin S. Lee
David Brooks
Carole-Jean Wu
GNN
58
188
0
08 Jan 2020
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke
Udit Gupta
Carole-Jean Wu
B. Cho
Mark Hempstead
...
Dheevatsa Mudigere
Maxim Naumov
Martin D. Schatz
M. Smelyanskiy
Xiaodong Wang
53
217
0
30 Dec 2019
TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning
Youngeun Kwon
Yunjae Lee
Minsoo Rhu
43
210
0
08 Aug 2019
The Architectural Implications of Facebook's DNN-based Personalized Recommendation
Udit Gupta
Carole-Jean Wu
Xiaodong Wang
Maxim Naumov
Brandon Reagen
...
Andrey Malevich
Dheevatsa Mudigere
M. Smelyanskiy
Liang Xiong
Xuan Zhang
GNN
73
290
0
06 Jun 2019
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
...
Wenlin Chen
Vijay Rao
Bill Jia
Liang Xiong
M. Smelyanskiy
85
732
0
31 May 2019
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications
Jongsoo Park
Maxim Naumov
Protonu Basu
Summer Deng
Aravind Kalaiah
...
Lin Qiao
Vijay Rao
Nadav Rotem
S. Yoo
M. Smelyanskiy
FedML
GNN
BDL
57
187
0
24 Nov 2018
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
Assaf Eisenman
Maxim Naumov
Darryl Gardner
M. Smelyanskiy
S. Pupyrev
K. Hazelwood
Asaf Cidon
Sachin Katti
46
81
0
14 Nov 2018
TensorFlow-Serving: Flexible, High-Performance ML Serving
Christopher Olston
Noah Fiedel
Kiril Gorovoy
Jeremiah Harmsen
Li Lao
Fangwei Li
Vinu Rajashekhar
Sukriti Ramesh
Jordan Soyke
51
306
0
17 Dec 2017
Wide & Deep Learning for Recommender Systems
Heng-Tze Cheng
L. Koc
Jeremiah Harmsen
T. Shaked
Tushar Chandra
...
Zakaria Haque
Lichan Hong
Vihan Jain
Xiaobing Liu
Hemal Shah
HAI
VLM
149
3,655
0
24 Jun 2016