ResearchTrend.AI

Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference

21 July 2019
Weiwen Jiang
E. Sha
Xinyi Zhang
Lei Yang
Qingfeng Zhuge
Yiyu Shi
Jiaxi Hu

Papers citing "Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference"

16 / 16 papers shown
Embedded Distributed Inference of Deep Neural Networks: A Systematic Review
Federico Nicolás Peccia
Oliver Bringmann
41
0
0
06 May 2024
SGPRS: Seamless GPU Partitioning Real-Time Scheduler for Periodic Deep Learning Workloads
Amir Fakhim Babaei
Thidapat Chantem
14
2
0
13 Apr 2024
TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs
Neha Prakriya
Yuze Chi
Suhail Basalama
Linghao Song
Jason Cong
40
1
0
16 Nov 2023
On-Device Unsupervised Image Segmentation
Junhuan Yang
Yi Sheng
Yu-zhao Zhang
Weiwen Jiang
Lei Yang
35
12
0
24 Feb 2023
A Semi-Decoupled Approach to Fast and Optimal Hardware-Software Co-Design of Neural Accelerators
Bingqian Lu
Zheyu Yan
Yiyu Shi
Shaolei Ren
26
2
0
25 Mar 2022
EF-Train: Enable Efficient On-device CNN Training on FPGA Through Data Reshaping for Online Adaptation or Personalization
Yue Tang
Xinyi Zhang
Peipei Zhou
Jingtong Hu
21
17
0
18 Feb 2022
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Panjie Qi
E. Sha
Qingfeng Zhuge
Hongwu Peng
Shaoyi Huang
Zhenglun Kong
Yuhong Song
Bingbing Li
11
50
0
19 Oct 2021
Can Noise on Qubits Be Learned in Quantum Neural Network? A Case Study on QuantumFlow
Zhiding Liang
Zhepeng Wang
Junhuan Yang
Lei Yang
Jinjun Xiong
Y. Shi
Weiwen Jiang
38
37
0
08 Sep 2021
Enabling OpenMP Task Parallelism on Multi-FPGAs
Ramon Nepomuceno
Renan Sterle
G. Valarini
M. Pereira
H. Yviquel
Guido Araujo
11
2
0
19 Mar 2021
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices
Yuhong Song
Weiwen Jiang
Bingbing Li
Panjie Qi
Qingfeng Zhuge
E. Sha
Sakyasingha Dasgupta
Yiyu Shi
Caiwen Ding
18
18
0
12 Feb 2021
ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning
Sheng-Chun Kao
Geonhwa Jeong
T. Krishna
28
95
0
04 Sep 2020
Standing on the Shoulders of Giants: Hardware and Neural Architecture Co-Search with Hot Start
Weiwen Jiang
Lei Yang
Sakyasingha Dasgupta
Jiaxi Hu
Yiyu Shi
27
59
0
17 Jul 2020
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
Shail Dave
Riyadh Baghdadi
Tony Nowatzki
Sasikanth Avancha
Aviral Shrivastava
Baoxin Li
64
82
0
02 Jul 2020
Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks
Lei Yang
Zheyu Yan
Meng Li
Hyoukjun Kwon
Liangzhen Lai
T. Krishna
Vikas Chandra
Weiwen Jiang
Yiyu Shi
32
115
0
10 Feb 2020
Device-Circuit-Architecture Co-Exploration for Computing-in-Memory Neural Accelerators
Weiwen Jiang
Qiuwen Lou
Zheyu Yan
Lei Yang
Jiaxi Hu
X. S. Hu
Yiyu Shi
16
72
0
31 Oct 2019
When Single Event Upset Meets Deep Neural Networks: Observations, Explorations, and Remedies
Zheyu Yan
Yiyu Shi
Wang Liao
M. Hashimoto
Xichuan Zhou
Cheng Zhuo
AAML
14
48
0
10 Sep 2019