HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array

7 January 2019
Linghao Song, Jiachen Mao, Youwei Zhuo, Xuehai Qian, Hai Helen Li, Yiran Chen

Papers citing "HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array"

11 papers shown

HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis
Shiwei Zhang, Lansong Diao, Chuan Wu, Zongyan Cao, Siyu Wang, Wei Lin
11 Jan 2024

DEAP: Design Space Exploration for DNN Accelerator Parallelism
Ekansh Agrawal, Xiangyu Sam Xu
24 Dec 2023

A Survey From Distributed Machine Learning to Distributed Deep Learning
Mohammad Dehghani, Zahra Yazdanparast
11 Jul 2023

Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, H. Kim, Youngsok Kim, Jinho Lee
24 Jan 2023 · GNN

Demystifying Map Space Exploration for NPUs
Sheng-Chun Kao, A. Parashar, Po-An Tsai, T. Krishna
07 Oct 2022

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan, Thomas C. P. Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah
20 Sep 2022

Layer-Wise Partitioning and Merging for Efficient and Scalable Deep Learning
S. Akintoye, Liangxiu Han, H. Lloyd, Xin Zhang, Darren Dancey, Haoming Chen, Daoqiang Zhang
22 Jul 2022 · FedML

Special Session: Towards an Agile Design Methodology for Efficient, Reliable, and Secure ML Systems
Shail Dave, Alberto Marchisio, Muhammad Abdullah Hanif, Amira Guesmi, Aviral Shrivastava, Ihsen Alouani, Muhammad Shafique
18 Apr 2022

FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, T. Krishna
13 Jul 2021

FPRaker: A Processing Element For Accelerating Neural Network Training
Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo Vivancos, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos
15 Oct 2020

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li
02 Jul 2020