2201.07821
Cited By
Building a Performance Model for Deep Learning Recommendation Model Training on GPUs
19 January 2022
Zhongyi Lin
Louis Feng
E. K. Ardestani
Jaewon Lee
J. Lundell
Changkyu Kim
A. Kejariwal
John Douglas Owens
Papers citing
"Building a Performance Model for Deep Learning Recommendation Model Training on GPUs"
11 / 11 papers shown
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
Mingyu Liang
Hiwot Tadese Kassa
Wenyin Fu
Brian Coutinho
Louis Feng
Christina Delimitrou
28
0
0
12 Apr 2025
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs
Rishabh Jain
Vivek M. Bhasi
Adwait Jog
A. Sivasubramaniam
M. Kandemir
Chita R. Das
33
2
0
29 Oct 2024
Vidur: A Large-Scale Simulation Framework For LLM Inference
Amey Agrawal
Nitin Kedia
Jayashree Mohan
Ashish Panwar
Nipun Kwatra
Bhargav S. Gulavani
Ramachandran Ramjee
Alexey Tumanov
VLM
40
38
0
08 May 2024
Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms
Zhongyi Lin
Ning Sun
Pallab Bhattacharya
Xizhou Feng
Louis Feng
John Douglas Owens
42
1
0
19 Apr 2024
A Unified CPU-GPU Protocol for GNN Training
Yi-Chien Lin
Gangda Deng
Viktor Prasanna
GNN
3DH
34
2
0
25 Mar 2024
Review of compressed embedding layers and their applications for recommender systems
Tamás Hajgató
26
0
0
23 Jun 2023
Proteus: Simulating the Performance of Distributed DNN Training
Jiangfei Duan
Xiuhong Li
Ping Xu
Xingcheng Zhang
Shengen Yan
Yun Liang
Dahua Lin
81
10
0
04 Jun 2023
Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces
Srinivas Sridharan
Taekyung Heo
Louis Feng
Zhaodong Wang
M. Bergeron
...
Shengbao Zheng
Brian Coutinho
Saeed Rashidi
Changhai Man
T. Krishna
26
13
0
23 May 2023
The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment
Jared Fernandez
Jacob Kahn
Clara Na
Yonatan Bisk
Emma Strubell
FedML
38
10
0
13 Feb 2023
Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks
Mingyu Liang
Wenyin Fu
Louis Feng
Zhongyi Lin
P. Panakanti
Shengbao Zheng
Srinivas Sridharan
Christina Delimitrou
26
12
0
16 Dec 2022
Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems
Weijie Zhao
Deping Xie
Ronglai Jia
Yulei Qian
Rui Ding
Mingming Sun
P. Li
MoE
59
151
0
12 Mar 2020