PROFET: Profiling-based CNN Training Latency Prophet for GPU Cloud Instances

10 August 2022
Sungjae Lee
Y. Hur
Subin Park
Kyungyong Lee

Papers citing "PROFET: Profiling-based CNN Training Latency Prophet for GPU Cloud Instances"

31 / 31 papers shown
RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances
Baolin Li
Rohan Basu Roy
Tirthak Patel
V. Gadepally
K. Gettings
Devesh Tiwari
55
25
0
23 Jul 2022
Characterization and Prediction of Deep Learning Workloads in Large-Scale GPU Datacenters
Qi Hu
Peng Sun
Shengen Yan
Yonggang Wen
Tianwei Zhang
3DH GNN
82
133
0
03 Sep 2021
An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks
A. Kahira
Truong Thao Nguyen
L. Bautista-Gomez
Ryousei Takano
Rosa M. Badia
Mohamed Wahib
52
11
0
19 Apr 2021
A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Geoffrey X. Yu
Yubo Gao
P. Golikov
Gennady Pekhimenko
3DH
69
68
0
31 Jan 2021
Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks
Chuan-Chi Wang
Ying-Chiao Liao
Ming-Chang Kao
Wen-Yew Liang
Shih-Hao Hung
28
11
0
01 Dec 2020
Predicting Training Time Without Training
Luca Zancato
Alessandro Achille
Avinash Ravichandran
Rahul Bhotika
Stefano Soatto
137
24
0
28 Aug 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
568
42,677
0
03 Dec 2019
Characterizing Deep Learning Training Workloads on Alibaba-PAI
Mengdi Wang
Chen Meng
Guoping Long
Chuan Wu
Jun Yang
Wei Lin
Yangqing Jia
65
55
0
14 Oct 2019
Once-for-All: Train One Network and Specialize it for Efficient Deployment
Han Cai
Chuang Gan
Tianzhe Wang
Zhekai Zhang
Song Han
OOD
126
1,283
0
26 Aug 2019
Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads
Myeongjae Jeon
Shivaram Venkataraman
Amar Phanishayee
Junjie Qian
Wencong Xiao
Fan Yang
GNN
74
354
0
17 Jan 2019
Serverless Computing: One Step Forward, Two Steps Back
J. M. Hellerstein
Jose M. Faleiro
Joseph E. Gonzalez
Johann Schleier-Smith
Vikram Sreekanti
Alexey Tumanov
Chenggang Wu
57
392
0
10 Dec 2018
Predicting the Computational Cost of Deep Learning Models
Daniel Justus
John Brennan
Stephen Bonner
A. Mcgough
46
230
0
28 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg
1.8K
95,324
0
11 Oct 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
226
19,353
0
13 Jan 2018
NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks
E. Cai
Da-Cheng Juan
Dimitrios Stamoulis
Diana Marculescu
51
132
0
15 Oct 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
808
132,725
0
12 Jun 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
239
4,648
0
16 Apr 2017
Speed/accuracy trade-offs for modern convolutional object detectors
Jonathan Huang
V. Rathod
Chen Sun
Menglong Zhu
Anoop Korattikara Balan
...
Ian S. Fischer
Z. Wojna
Yang Song
S. Guadarrama
Kevin Patrick Murphy
3DH 3DV
108
2,573
0
30 Nov 2016
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
487
5,385
0
05 Nov 2016
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhiwen Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zhang
GNN AI4CE
435
18,361
0
27 May 2016
Theano: A Python framework for fast computation of mathematical expressions
The Theano Development Team
Rami Al-Rfou
Guillaume Alain
Amjad Almahairi
Christof Angermüller
...
Kelvin Xu
Lijun Xue
Li Yao
Saizheng Zhang
Ying Zhang
208
2,340
0
09 May 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,641
0
10 Dec 2015
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
Tianqi Chen
Mu Li
Yutian Li
Min Lin
Naiyan Wang
Minjie Wang
Tianjun Xiao
Bing Xu
Chiyuan Zhang
Zheng Zhang
200
2,248
0
03 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV BDL
886
27,444
0
02 Dec 2015
Empirical Evaluation of Rectified Activations in Convolutional Network
Bing Xu
Naiyan Wang
Tianqi Chen
Mu Li
147
2,916
0
05 May 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,433
0
22 Dec 2014
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
496
43,717
0
17 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt MDE
1.7K
100,575
0
04 Sep 2014
Return of the Devil in the Details: Delving Deep into Convolutional Nets
Ken Chatfield
Karen Simonyan
Andrea Vedaldi
Andrew Zisserman
FAtt
226
3,420
0
14 May 2014
Deep Learning in Neural Networks: An Overview
Jürgen Schmidhuber
HAI
250
16,405
0
30 Apr 2014
Practical Bayesian Optimization of Machine Learning Algorithms
Jasper Snoek
Hugo Larochelle
Ryan P. Adams
382
7,981
0
13 Jun 2012