arXiv:1412.7580
Fast Convolutional Nets With fbfft: A GPU Performance Evaluation
24 December 2014
Nicolas Vasilache
Jeff Johnson
Michaël Mathieu
Soumith Chintala
Serkan Piantino
Yann LeCun
Papers citing "Fast Convolutional Nets With fbfft: A GPU Performance Evaluation"
50 / 53 papers shown
Title
Underwater object detection in sonar imagery with detection transformer and Zero-shot neural architecture search
XiaoTong Gu
Shengyu Tang
Yiming Cao
Changdong Yu
ViT
36
0
0
10 May 2025
AdaOPC: A Self-Adaptive Mask Optimization Framework For Real Design Patterns
Wenqian Zhao
Xufeng Yao
Ziyang Yu
Guojin Chen
Yuzhe Ma
Bei Yu
Martin D. F. Wong
30
17
0
15 Mar 2023
Pruning Very Deep Neural Network Channels for Efficient Inference
Yihui He
35
1
0
14 Nov 2022
Testing predictions of representation cost theory with CNNs
Charles Godfrey
Elise Bishoff
Myles Mckay
Davis Brown
Grayson Jorgenson
Henry Kvinge
E. Byler
34
0
0
03 Oct 2022
Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers
Nurullah Sevim
Ege Ozan Özyedek
Furkan Şahinuç
Aykut Koç
40
11
0
26 Sep 2022
Learning Convolutional Neural Networks in the Frequency Domain
H. Pan
Yixin Chen
Xin-Yi Niu
Wenbo Zhou
Dongsheng Li
OOD
25
9
0
14 Apr 2022
NUMA-aware FFT-based Convolution on ARMv8 Many-core CPUs
Xiandong Huang
Qinglin Wang
Shuyu Lu
Ruochen Hao
Songzhu Mei
Jie Liu
21
7
0
25 Sep 2021
Efficient and Generic 1D Dilated Convolution Layer for Deep Learning
Narendra Chaudhary
Sanchit Misra
Dhiraj D. Kalamkar
A. Heinecke
E. Georganas
Barukh Ziv
Menachem Adelman
Bharat Kaul
32
9
0
16 Apr 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
30
395
0
23 Mar 2021
Deep Networks from the Principle of Rate Reduction
Kwan Ho Ryan Chan
Yaodong Yu
Chong You
Haozhi Qi
John N. Wright
Yi Ma
22
21
0
27 Oct 2020
Exploring Sparsity in Image Super-Resolution for Efficient Inference
Longguang Wang
Xiaoyu Dong
Yingqian Wang
Xinyi Ying
Zaiping Lin
W. An
Yulan Guo
SupR
34
4
0
17 Jun 2020
ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network
David Gschwend
35
64
0
14 May 2020
Machine Learning enabled Spectrum Sharing in Dense LTE-U/Wi-Fi Coexistence Scenarios
Adam Dziedzic
V. Sathya
M. I. Rochman
M. Ghosh
S. Krishnan
24
19
0
18 Mar 2020
Exploiting Verified Neural Networks via Floating Point Numerical Error
Kai Jia
Martin Rinard
AAML
37
34
0
06 Mar 2020
Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks
Yihui He
Jianing Qian
Jianren Wang
Cindy X. Le
Congrui Hetang
Qi Lyu
Wenping Wang
Tianwei Yue
50
11
0
21 Oct 2019
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Xiaohan Ding
Guiguang Ding
Xiangxin Zhou
Yuchen Guo
Jungong Han
Ji Liu
18
162
0
27 Sep 2019
A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks
A. Mehrabian
M. Miscuglio
Yousra Alkabani
V. Sorger
T. El-Ghazawi
27
46
0
25 Jun 2019
Object Detection in 20 Years: A Survey
Zhengxia Zou
Keyan Chen
Zhenwei Shi
Yuhong Guo
Jieping Ye
VLM
ObjD
AI4TS
34
2,290
0
13 May 2019
MobiVSR: A Visual Speech Recognition Solution for Mobile Devices
Nilay Shrivastava
Astitwa Saxena
Yaman Kumar Singla
Preeti Kaur
Debanjan Mahata
R. Shah
27
3
0
10 May 2019
A detailed comparative study of open source deep learning frameworks
Ghadeer Al-Bdour
Raffi Al-Qurran
M. Al-Ayyoub
A. Shatnawi
ELM
21
13
0
25 Feb 2019
Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning
Shaohui Lin
Rongrong Ji
Yuchao Li
Cheng Deng
Xuelong Li
35
70
0
23 Jan 2019
Efficient Winograd Convolution via Integer Arithmetic
Lingchuan Meng
J. Brothers
16
29
0
07 Jan 2019
High Performance Zero-Memory Overhead Direct Convolutions
Jiyuan Zhang
F. Franchetti
Tze Meng Low
17
68
0
20 Sep 2018
2PFPCE: Two-Phase Filter Pruning Based on Conditional Entropy
Chuhan Min
Aosen Wang
Yiran Chen
Wenyao Xu
Xin Chen
30
41
0
06 Sep 2018
Auto Deep Compression by Reinforcement Learning Based Actor-Critic Structure
Hamed Hakkak
OffRL
AI4CE
15
1
0
08 Jul 2018
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
Renzo Andri
Lukas Cavigelli
D. Rossi
Luca Benini
MQ
24
19
0
05 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
33
704
0
26 Feb 2018
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
Nicolas Vasilache
O. Zinenko
Theodoros Theodoridis
Priya Goyal
Zach DeVito
William S. Moses
Sven Verdoolaege
Andrew Adams
Albert Cohen
40
432
0
13 Feb 2018
AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Yihui He
Ji Lin
Zhijian Liu
Hanrui Wang
Li Li
Song Han
35
1,343
0
10 Feb 2018
High performance ultra-low-precision convolutions on mobile devices
Andrew Tulloch
Yangqing Jia
HAI
MQ
18
27
0
06 Dec 2017
Distributed Training Large-Scale Deep Architectures
Shang-Xuan Zou
Chun-Yen Chen
Jui-Lin Wu
Chun-Nan Chou
Chia-Chin Tsao
Kuan-Chieh Tung
Ting-Wei Lin
Cheng-Lung Sung
Edward Y. Chang
26
22
0
10 Aug 2017
Channel Pruning for Accelerating Very Deep Neural Networks
Yihui He
Xiangyu Zhang
Jian Sun
119
2,508
0
19 Jul 2017
MEC: Memory-efficient Convolution for Deep Neural Network
Minsik Cho
D. Brand
21
86
0
21 Jun 2017
Network Sketching: Exploiting Binary Structure in Deep CNNs
Yiwen Guo
Anbang Yao
Hao Zhao
Yurong Chen
MQ
37
95
0
07 Jun 2017
Optimizing Memory Efficiency for Convolution Kernels on Kepler GPUs
Xiaoming Chen
Jianxu Chen
Danny Chen
X. S. Hu
27
10
0
29 May 2017
Deep Learning in the Automotive Industry: Applications and Tools
André Luckow
M. Cook
Nathan Ashcraft
Edwin Weill
Emil Djerekarov
Bennie Vorster
28
116
0
30 Apr 2017
HPTT: A High-Performance Tensor Transposition C++ Library
P. Springer
Tong Su
Paolo Bientinesi
16
49
0
14 Apr 2017
CBinfer: Change-Based Inference for Convolutional Neural Networks on Video Data
Lukas Cavigelli
Philippe Degen
Luca Benini
BDL
25
51
0
14 Apr 2017
Exploring the Design Space of Deep Convolutional Neural Networks at Large Scale
F. Iandola
3DV
26
18
0
20 Dec 2016
Faster CNNs with Direct Sparse Convolutions and Guided Pruning
Jongsoo Park
Sheng Li
W. Wen
P. T. P. Tang
Hai Helen Li
Yiran Chen
Pradeep Dubey
39
182
0
04 Aug 2016
Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs
Stefan Hadjis
Ce Zhang
Ioannis Mitliagkas
Dan Iter
Christopher Ré
20
65
0
14 Jun 2016
Optimizing Performance of Recurrent Neural Networks on GPUs
J. Appleyard
Tomás Kociský
Phil Blunsom
25
91
0
07 Apr 2016
TTC: A high-performance Compiler for Tensor Transpositions
P. Springer
J. Hammond
Paolo Bientinesi
25
17
0
07 Mar 2016
Very Efficient Training of Convolutional Neural Networks using Fast Fourier Transform and Overlap-and-Add
Tyler Highlander
Andres Rodriguez
26
59
0
25 Jan 2016
On the energy landscape of deep networks
Pratik Chaudhari
Stefano Soatto
ODL
43
27
0
20 Nov 2015
Comparative Study of Deep Learning Software Frameworks
S. Bahrampour
Naveen Ramakrishnan
Lukas Schott
Mohak Shah
29
161
0
19 Nov 2015
FireCaffe: near-linear acceleration of deep neural network training on compute clusters
F. Iandola
Khalid Ashraf
Matthew W. Moskewicz
Kurt Keutzer
30
302
0
31 Oct 2015
Compact Convolutional Neural Network Cascade for Face Detection
Ilya Kalinowski
V. Spitsyn
CVBM
24
35
0
06 Aug 2015
Deep Convolutional Networks on Graph-Structured Data
Mikael Henaff
Joan Bruna
Yann LeCun
GNN
21
1,581
0
16 Jun 2015
Compressing Convolutional Neural Networks
Wenlin Chen
James T. Wilson
Stephen Tyree
Kilian Q. Weinberger
Yixin Chen
26
139
0
14 Jun 2015