2206.08301
Deinsum: Practically I/O Optimal Multilinear Algebra
16 June 2022
A. Ziogas
Grzegorz Kwaśniewski
Tal Ben-Nun
Timo Schneider
Torsten Hoefler
Papers citing
"Deinsum: Practically I/O Optimal Multilinear Algebra"
12 / 12 papers shown
Title
On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations
Grzegorz Kwaśniewski
Marko Kabić
Tal Ben-Nun
A. Ziogas
Jens Eirik Saethre
...
Timo Schneider
Maciej Besta
Anton Kozhevnikov
J. VandeVondele
Torsten Hoefler
63
15
0
20 Aug 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
122
97
0
01 Jul 2021
HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation
Qingcheng Xiao
Wenlei Bao
Bingzhe Wu
Pengcheng Xu
Xuehai Qian
Yun Liang
101
68
0
04 May 2021
Array Programming with NumPy
Charles R. Harris
K. Millman
S. Walt
R. Gommers
Pauli Virtanen
...
Tyler Reddy
Warren Weckesser
Hameer Abbasi
C. Gohlke
T. Oliphant
156
14,986
0
18 Jun 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
520
42,559
0
03 Dec 2019
Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication
Grzegorz Kwaśniewski
Marko Kabić
Maciej Besta
J. VandeVondele
R. Solcà
Torsten Hoefler
LRM
50
93
0
26 Aug 2019
Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures
Tal Ben-Nun
Johannes de Fine Licht
A. Ziogas
Timo Schneider
Torsten Hoefler
84
103
0
27 Feb 2019
Communication-avoiding Cholesky-QR2 for rectangular matrices
Edward Hutter
Edgar Solomonik
34
8
0
23 Oct 2017
Communication Lower Bounds for Matricized Tensor Times Khatri-Rao Product
Grey Ballard
Nicholas Knight
Kathryn Rouse
47
31
0
24 Aug 2017
On Optimizing Distributed Tucker Decomposition for Dense Tensors
Venkatesan T. Chakaravarthy
Jee W. Choi
Douglas J. Joseph
Xing Liu
Prakash Murali
Yogish Sabharwal
D. Sreedhar
41
30
0
18 Jul 2017
HPTT: A High-Performance Tensor Transposition C++ Library
P. Springer
Tong Su
Paolo Bientinesi
39
50
0
14 Apr 2017
Communication-Optimal Parallel Algorithm for Strassen's Matrix Multiplication
Grey Ballard
J. Demmel
Olga Holtz
Benjamin Lipshitz
O. Schwartz
69
137
0
14 Feb 2012