Parallel Multi Channel Convolution using General Matrix Multiplication

6 April 2017

Papers citing "Parallel Multi Channel Convolution using General Matrix Multiplication"

Title | Authors | Tags | Counts | Date
BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration | M. Rakka, Rachid Karami, A. Eltawil, M. Fouda, Fadi J. Kurdahi | MQ | 47 1 0 | 03 Nov 2024
Optimizing Sparse Convolution on GPUs with CUDA for 3D Point Cloud Processing in Embedded Systems | Chester Luo, Kevin Lai | 3DPC | 36 0 0 | 12 Feb 2024
All Rivers Run to the Sea: Private Learning with Asymmetric Flows | Yue Niu, Ramy E. Ali, Saurav Prakash, Salman Avestimehr | FedML | 38 2 0 | 05 Dec 2023
Sliding Window Sum Algorithms for Deep Neural Networks | R. Snytsar | TPM, AI4TS | 28 3 0 | 25 May 2023
OLLIE: Derivation-based Tensor Program Optimizer | Liyan Zheng, Haojie Wang, Jidong Zhai, Muyan Hu, Zixuan Ma, Tuowei Wang, Shizhi Tang, Lei Xie, Kezhao Huang, Zhihao Jia | | 46 3 0 | 02 Aug 2022
Lipschitz Bound Analysis of Neural Networks | S. Bose | AAML | 42 0 0 | 14 Jul 2022
Neuro-Symbolic AI: An Emerging Class of AI Workloads and their Characterization | Zachary Susskind, Bryce Arden, L. John, Patrick A Stockton, E. John | NAI | 30 41 0 | 13 Sep 2021
Content-Aware Convolutional Neural Networks | Yong Guo, Yaofo Chen, Mingkui Tan, Kui Jia, Jian Chen, Jingdong Wang | | 36 8 0 | 30 Jun 2021
Post-Training Sparsity-Aware Quantization | Gil Shomron, F. Gabbay, Samer Kurzum, U. Weiser | MQ | 46 33 0 | 23 May 2021
Efficient and Generic 1D Dilated Convolution Layer for Deep Learning | Narendra Chaudhary, Sanchit Misra, Dhiraj D. Kalamkar, A. Heinecke, E. Georganas, Barukh Ziv, Menachem Adelman, Bharat Kaul | | 32 9 0 | 16 Apr 2021
Extending Sparse Tensor Accelerators to Support Multiple Compression Formats | Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon, Sudarshan Srinivasan, Dipankar Das, G. Moon, S. Rajamanickam, T. Krishna | | 35 18 0 | 18 Mar 2021
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead | Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Mohamed Bennai | BDL | 64 140 0 | 21 Dec 2020
Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores | Orestis Zachariadis, Nitin Satpute, Juan Gómez Luna, J. Olivares | | 22 60 0 | 29 Sep 2020
ICA-UNet: ICA Inspired Statistical UNet for Real-time 3D Cardiac Cine MRI Segmentation | Tianchen Wang, Xiaowei Xu, Jinjun Xiong, Qianjun Jia, Haiyun Yuan, Meiping Huang, Jian Zhuang, Yiyu Shi | | 22 21 0 | 18 Jul 2020
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights | Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li | | 64 82 0 | 02 Jul 2020
FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review | Ahmad Shawahna, S. M. Sait, A. El-Maleh | | 28 372 0 | 01 Jan 2019
Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures | E. Georganas, Sasikanth Avancha, K. Banerjee, Dhiraj D. Kalamkar, G. Henry, Hans Pabst, A. Heinecke | BDL | 22 105 0 | 16 Aug 2018
A model-driven approach for a new generation of adaptive libraries | Marco Cianfriglia, Damiano Perri, C. Nugteren, Anton Lokhmotov, G. Fursin | | 29 14 0 | 19 Jun 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis | Tal Ben-Nun, Torsten Hoefler | GNN | 33 704 0 | 26 Feb 2018
Optimal DNN Primitive Selection with Partitioned Boolean Quadratic Programming | Andrew Anderson, David Gregg | | 32 34 0 | 03 Oct 2017