1906.03109
Cited By
The Architectural Implications of Facebook's DNN-based Personalized Recommendation
6 June 2019
Udit Gupta
Carole-Jean Wu
Xiaodong Wang
Maxim Naumov
Brandon Reagen
David Brooks
Bradford Cottel
K. Hazelwood
Bill Jia
Hsien-Hsin S. Lee
Andrey Malevich
Dheevatsa Mudigere
M. Smelyanskiy
Liang Xiong
Xuan Zhang
GNN
Papers citing
"The Architectural Implications of Facebook's DNN-based Personalized Recommendation"
50 / 105 papers shown
Title
Ember: A Compiler for Efficient Embedding Operations on Decoupled Access-Execute Architectures
Marco Siracusa
Olivia Hsu
Victor Soria-Pardos
Joshua Randall
Arnaud Grasset
...
Doug Joseph
Randy Allen
Fredrik Kjolstad
Miquel Moretó Planas
Adrià Armejach
31
0
0
14 Apr 2025
SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models
Jinho Yang
Ji-Hoon Kim
Joo-Young Kim
41
0
0
01 Apr 2025
Palermo: Improving the Performance of Oblivious Memory using Protocol-Hardware Co-Design
Haojie Ye
Yuchen Xia
Yuhan Chen
Kuan-Yu Chen
Yichao Yuan
Shuwen Deng
Baris Kasikci
T. Mudge
Nishil Talati
23
0
0
08 Nov 2024
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs
Rishabh Jain
Vivek M. Bhasi
Adwait Jog
A. Sivasubramaniam
M. Kandemir
Chita R. Das
28
2
0
29 Oct 2024
DQRM: Deep Quantized Recommendation Models
Yang Zhou
Zhen Dong
Ellick Chan
Dhiraj Kalamkar
Diana Marculescu
Kurt Keutzer
MQ
50
1
0
26 Oct 2024
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference
Yejin Lee
Anna Y. Sun
Basil Hosmer
Bilge Acun
Can Balioglu
...
Ram Pasunuru
Scott Yih
Sravya Popuri
Xing Liu
Carole-Jean Wu
52
2
0
30 Sep 2024
PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences
Pingyi Huo
Anusha Devulapally
Hasan Al Maruf
Minseo Park
Krishnakumar Nair
Meena Arunachalam
Gulsum Gudukbay Akbulut
M. Kandemir
Vijaykrishnan Narayanan
34
0
0
25 Sep 2024
FedSlate: A Federated Deep Reinforcement Learning Recommender System
Yongxin Deng
Xihe Qiu
Xiaoyu Tan
Yaochu Jin
FedML
93
0
0
23 Sep 2024
Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters
Harisankar Sadasivan
Muhammad Osama
Maksim Podkorytov
Carlus Huang
Jun Liu
20
0
0
21 Aug 2024
UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture
Sitian Chen
Haobin Tan
Amelie Chi Zhou
Yusen Li
Pavan Balaji
AI4CE
20
2
0
20 Jun 2024
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Yunjae Lee
Hyeseong Kim
Minsoo Rhu
34
3
0
11 Jun 2024
ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models
Yujeong Choi
Jiin Kim
Minsoo Rhu
37
1
0
11 Jun 2024
Beyond Efficiency: Scaling AI Sustainably
Carole-Jean Wu
Bilge Acun
Ramya Raghavendra
Kim Hazelwood
GNN
41
14
0
08 Jun 2024
LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models
Juntaek Lim
Youngeun Kwon
Ranggi Hwang
Kiwan Maeng
Edward Suh
Minsoo Rhu
SyDa
31
0
0
12 Apr 2024
Accelerating Distributed Deep Learning using Lossless Homomorphic Compression
Haoyu Li
Yuchen Xu
Jiayi Chen
Rohit Dwivedula
Wenfei Wu
Keqiang He
Aditya Akella
Daehyeok Kim
FedML
AI4CE
19
4
0
12 Feb 2024
ACCL+: an FPGA-Based Collective Engine for Distributed Applications
Zhenhao He
Dario Korolija
Yu Zhu
Benjamin Ramhorst
Tristan Laan
L. Petrica
Michaela Blott
Gustavo Alonso
GNN
21
6
0
18 Dec 2023
CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models
Hailin Zhang
Zirui Liu
Boxuan Chen
Yikai Zhao
Tong Zhao
Tong Yang
Bin Cui
32
10
0
06 Dec 2023
MOSEL: Inference Serving Using Dynamic Modality Selection
Bodun Hu
Le Xu
Jeongyoon Moon
N. Yadwadkar
Aditya Akella
11
4
0
27 Oct 2023
MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Samuel Hsia
Alicia Golden
Bilge Acun
Newsha Ardalani
Zach DeVito
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
MoE
43
9
0
04 Oct 2023
Ad-Rec: Advanced Feature Interactions to Address Covariate-Shifts in Recommendation Networks
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Prashant J. Nair
32
3
0
28 Aug 2023
Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?
Seyed Morteza Nabavinejad
M. Ebrahimi
Sherief Reda
14
1
0
26 Aug 2023
InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models
Kabir Nagrecha
Lingyi Liu
P. Delgado
Prasanna Padmanabhan
OffRL
AI4CE
25
5
0
13 Aug 2023
Evaluating and Enhancing Robustness of Deep Recommendation Systems Against Hardware Errors
Dongning Ma
Xun Jiao
Fred Lin
Mengshi Zhang
Alban Desmaison
Thomas Sellinger
Daniel Moore
Sriram Sankar
24
2
0
17 Jul 2023
Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems
Debopam Sanyal
Jui-Tse Hung
Manavi Agrawal
Prahlad Jasti
Shahab Nikkhoo
S. Jha
Tianhao Wang
Sibin Mohan
Alexey Tumanov
42
0
0
03 Jul 2023
Mem-Rec: Memory Efficient Recommendation System using Alternative Representation
Gopu Krishna Jha
Anthony Thomas
Nilesh Jain
Sameh Gobriel
Tajana Rosing
Ravi Iyer
45
2
0
12 May 2023
Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models
Daochen Zha
Louis Feng
Liangchen Luo
Bhargav Bhushanam
Zirui Liu
...
J. McMahon
Yuzhen Huang
Bryan Clarke
A. Kejariwal
Xia Hu
50
7
0
03 May 2023
BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware
Nicholas Meisburger
V. Lakshman
Benito Geordie
Joshua Engels
David Torres Ramos
...
Benjamin Meisburger
Shubh Gupta
Yashwanth Adunukota
Tharun Medini
Anshumali Shrivastava
17
2
0
30 Mar 2023
Hera: A Heterogeneity-Aware Multi-Tenant Inference Server for Personalized Recommendations
Yujeong Choi
John Kim
Minsoo Rhu
13
1
0
23 Feb 2023
MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
Samuel Hsia
Udit Gupta
Bilge Acun
Newsha Ardalani
Pan Zhong
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
41
17
0
21 Feb 2023
GPU-based Private Information Retrieval for On-Device Machine Learning Inference
Maximilian Lam
Jeff Johnson
Wenjie Xiong
Kiwan Maeng
Udit Gupta
...
Hsien-Hsin S. Lee
Vijay Janapa Reddi
Gu-Yeon Wei
David Brooks
Edward Suh
24
9
0
26 Jan 2023
AttMEMO: Accelerating Transformers with Memoization on Big Memory Systems
Yuan Feng
Hyeran Jeon
F. Blagojevic
Cyril Guyot
Qing Li
Dong Li
GNN
19
3
0
23 Jan 2023
FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models
Geet Sethi
Pallab Bhattacharya
Dhruv Choudhary
Carole-Jean Wu
Christos Kozyrakis
18
5
0
08 Jan 2023
Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks
Mingyu Liang
Wenyin Fu
Louis Feng
Zhongyi Lin
P. Panakanti
Shengbao Zheng
Srinivas Sridharan
Christina Delimitrou
21
12
0
16 Dec 2022
Data Leakage via Access Patterns of Sparse Features in Deep Learning-based Recommendation Systems
H. Hashemi
Wenjie Xiong
Liu Ke
Kiwan Maeng
M. Annavaram
G. E. Suh
Hsien-Hsin S. Lee
32
6
0
12 Dec 2022
DisaggRec: Architecting Disaggregated Systems for Large-Scale Personalized Recommendation
Liu Ke
Xuan Zhang
Benjamin C. Lee
G. E. Suh
Hsien-Hsin S. Lee
41
8
0
02 Dec 2022
COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training
D. Kadiyala
Saeed Rashidi
Taekyung Heo
Abhimanyu Bambhaniya
T. Krishna
Alexandros Daglis
VLM
24
9
0
30 Nov 2022
KAIROS: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources
Baolin Li
S. Samsi
V. Gadepally
Devesh Tiwari
22
11
0
12 Oct 2022
DreamShard: Generalizable Embedding Table Placement for Recommender Systems
Daochen Zha
Louis Feng
Qiaoyu Tan
Zirui Liu
Kwei-Herng Lai
Bhargav Bhushanam
Yuandong Tian
A. Kejariwal
Xia Hu
LMTD
OffRL
20
28
0
05 Oct 2022
Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud
Geraldo F. Oliveira
Juan Gómez Luna
Saugata Ghose
Amirali Boroumand
O. Mutlu
21
23
0
19 Sep 2022
MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms
Yuke Wang
Boyuan Feng
Zheng Wang
Tong Geng
Kevin J. Barker
Ang Li
Yufei Ding
GNN
45
25
0
14 Sep 2022
HammingMesh: A Network Topology for Large-Scale Deep Learning
Torsten Hoefler
Tommaso Bonato
Daniele De Sensi
Salvatore Di Girolamo
Shigang Li
Marco Heddes
Jon Belk
Deepak Goel
Miguel Castro
Steve Scott
3DH
GNN
AI4CE
26
20
0
03 Sep 2022
AutoShard: Automated Embedding Table Sharding for Recommender Systems
Daochen Zha
Louis Feng
Bhargav Bhushanam
Dhruv Choudhary
Jade Nie
Yuandong Tian
Jay Chae
Yi-An Ma
A. Kejariwal
Xia Hu
35
30
0
12 Aug 2022
Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training
Jie You
Jaehoon Chung
Mosharaf Chowdhury
23
75
0
12 Aug 2022
RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances
Baolin Li
Rohan Basu Roy
Tirthak Patel
V. Gadepally
K. Gettings
Devesh Tiwari
27
25
0
23 Jul 2022
MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant Systems for Machine Learning
Baolin Li
Tirthak Patel
S. Samsi
V. Gadepally
Devesh Tiwari
12
51
0
23 Jul 2022
The trade-offs of model size in large recommendation models: A 10000× compressed criteo-tb DLRM model (100 GB parameters to mere 10MB)
Aditya Desai
Anshumali Shrivastava
AI4CE
15
3
0
21 Jul 2022
Low-latency Mini-batch GNN Inference on CPU-FPGA Heterogeneous Platform
Bingyi Zhang
Hanqing Zeng
Viktor Prasanna
GNN
24
12
0
17 Jun 2022
Towards Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity
Kiwan Maeng
Haiyu Lu
Luca Melis
John Nguyen
Michael G. Rabbat
Carole-Jean Wu
FedML
29
31
0
30 May 2022
Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards
Youngeun Kwon
Minsoo Rhu
21
27
0
10 May 2022
Heterogeneous Acceleration Pipeline for Recommendation System Training
Muhammad Adnan
Yassaman Ebrahimzadeh Maboud
Divyat Mahajan
Prashant J. Nair
26
18
0
11 Apr 2022