ResearchTrend.AI

A Hardware-Software Blueprint for Flexible Deep Learning Specialization
arXiv: 1807.04188

11 July 2018
T. Moreau
Tianqi Chen
Luis Vega
Jared Roesch
Eddie Q. Yan
Lianmin Zheng
Josh Fromm
Ziheng Jiang
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy

Papers citing "A Hardware-Software Blueprint for Flexible Deep Learning Specialization"

15 / 15 papers shown
Efficient Edge AI: Deploying Convolutional Neural Networks on FPGA with the Gemmini Accelerator
Federico Nicolás Peccia
Svetlana Pavlitska
Tobias Fleck
Oliver Bringmann
30
0
0
14 Aug 2024
Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
Guillermo Alaejos
Adrián Castelló
P. Alonso-Jordá
Francisco D. Igual
Héctor J. Martínez
Enrique S. Quintana-Ortí
16
2
0
31 Oct 2023
Tackling the Matrix Multiplication Micro-kernel Generation with Exo
Adrián Castelló
Julian Bellavita
Grace Dinh
Yuka Ikarashi
Héctor J. Martínez
6
4
0
26 Oct 2023
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
Siyuan Feng
Bohan Hou
Hongyi Jin
Wuwei Lin
Junru Shao
...
Zihao Ye
Lianmin Zheng
Cody Hao Yu
Yong Yu
Tianqi Chen
28
66
0
09 Jul 2022
Bifrost: End-to-End Evaluation and Optimization of Reconfigurable DNN Accelerators
Axel Stjerngren
Perry Gibson
José Cano
34
4
0
26 Apr 2022
NNReArch: A Tensor Program Scheduling Framework Against Neural Network Architecture Reverse Engineering
Yukui Luo
Shijin Duan
Gongye Cheng
Yunsi Fei
Xiaolin Xu
19
8
0
22 Mar 2022
A Highly Configurable Hardware/Software Stack for DNN Inference Acceleration
Suvadeep Banerjee
Steve Burns
P. Cocchini
A. Davare
Shweta Jain
D. Kirkpatrick
A. Sorokin
Jin Yang
Zhenkun Yang
28
9
0
29 Nov 2021
Bandwidth Utilization Side-Channel on ML Inference Accelerators
Sarbartha Banerjee
Shijia Wei
Prakash Ramrakhyani
Mohit Tiwari
23
3
0
14 Oct 2021
Bring Your Own Codegen to Deep Learning Compiler
Zhi Chen
Cody Hao Yu
Trevor Morris
Jorn Tuyls
Yi-Hsiang Lai
Jared Roesch
Elliott Delaye
Vin Sharma
Yida Wang
19
14
0
03 May 2021
Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices
Guangli Li
Xiu Ma
Xueying Wang
Lei Liu
Jingling Xue
Xiaobing Feng
33
33
0
30 Oct 2020
SESAME: Software defined Enclaves to Secure Inference Accelerators with Multi-tenant Execution
Sarbartha Banerjee
Prakash Ramrakhyani
Shijia Wei
Mohit Tiwari
11
9
0
14 Jul 2020
Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration
Hasan Genç
Seah Kim
Alon Amid
Ameer Haj-Ali
Vighnesh Iyer
...
Ion Stoica
Jonathan Ragan-Kelley
Krste Asanović
B. Nikolić
Y. Shao
17
223
0
22 Nov 2019
SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning
Linxi Fan
Yuke Zhu
Jiren Zhu
Zihua Liu
Orien Zeng
Anchit Gupta
Joan Creus-Costa
Silvio Savarese
Li Fei-Fei
OffRL
GNN
43
3
0
27 Sep 2019
A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks
Xiaoyu Yu
Yuwei Wang
Jie Miao
Ephrem Wu
Heng Zhang
Yu Meng
Bo Zhang
Biao Min
Dewei Chen
Jianlin Gao
33
21
0
17 Sep 2019
DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators
Yu Xing
Shuang Liang
Lingzhi Sui
Xijie Jia
Jiantao Qiu
Xin Liu
Yushun Wang
Yu Wang
Yi Shan
41
68
0
20 Feb 2019