arXiv:1802.04730
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
13 February 2018
Nicolas Vasilache
O. Zinenko
Theodoros Theodoridis
Priya Goyal
Zach DeVito
William S. Moses
Sven Verdoolaege
Andrew Adams
Albert Cohen
Papers citing "Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions"
50 / 144 papers shown
Tilus: A Virtual Machine for Arbitrary Low-Precision GPGPU Computation in LLM Serving
Yaoyao Ding
Bohan Hou
X. Zhang
Allan Lin
Tianqi Chen
Cody Yu Hao
Yida Wang
Gennady Pekhimenko
41
0
0
17 Apr 2025
Scheduling Languages: A Past, Present, and Future Taxonomy
Mary Hall
Cosmin Oancea
Anne C. Elster
Ari Rasch
Sameeran Joshi
Amir Mohammad Tavakkoli
Richard Schulze
26
1
0
25 Oct 2024
Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness
Emil Vatai
Aleksandr Drozd
Ivan R. Ivanov
Yinghao Ren
M. Wahib
39
1
0
04 Oct 2024
CoolerSpace: A Language for Physically Correct and Computationally Efficient Color Programming
Ethan Chen
Jiwon Chang
Yuhao Zhu
21
0
0
04 Sep 2024
Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters
Harisankar Sadasivan
Muhammad Osama
Maksim Podkorytov
Carlus Huang
Jun Liu
20
0
0
21 Aug 2024
Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10
Yiqi Liu
Yuqi Xue
Yu Cheng
Lingxiao Ma
Ziming Miao
Jilong Xue
Jian Huang
GNN
21
1
0
09 Aug 2024
Dynamic Co-Optimization Compiler: Leveraging Multi-Agent Reinforcement Learning for Enhanced DNN Accelerator Performance
Arya Fayyazi
M. Kamal
Massoud Pedram
26
0
0
11 Jul 2024
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav
S. Sundram
Wonchan Lee
Michael Garland
Michael Bauer
Alex Aiken
Fredrik Kjolstad
33
1
0
26 Jun 2024
Scorch: A Library for Sparse Deep Learning
Bobby Yan
Alexander J. Root
Trevor Gale
David Broman
Fredrik Kjolstad
25
0
0
27 May 2024
Graph neural networks with configuration cross-attention for tensor compilers
Dmitrii Khizbullin
Eduardo Rocha de Andrade
Thanh Hau Nguyen
Matheus Pedroza Ferreira
David R. Pugh
GNN
21
0
0
26 May 2024
Allo: A Programming Model for Composable Accelerator Design
Hongzheng Chen
Niansong Zhang
Shaojie Xiang
Zhichen Zeng
Mengjia Dai
Zhiru Zhang
41
14
0
07 Apr 2024
LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers
Massinissa Merouani
Khaled Afif Boudaoud
Iheb Nassim Aouadj
Nassim Tchoulak
Islam Kara Bernou
Hamza Benyamina
F. B. Tayeb
K. Benatchba
Hugh Leather
Riyadh Baghdadi
37
2
0
18 Mar 2024
SoD²: Statically Optimizing Dynamic Deep Neural Network
Wei Niu
Gagan Agrawal
Bin Ren
30
4
0
29 Feb 2024
Unraveling the Key of Machine Learning Solutions for Android Malware Detection
Jiahao Liu
Jun Zeng
Fabio Pierazzi
Lorenzo Cavallaro
Zhenkai Liang
AAML
18
7
0
05 Feb 2024
CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators
Songyun Qu
Shixin Zhao
Bing Li
Yintao He
Xuyi Cai
Lei Zhang
Ying Wang
16
4
0
23 Jan 2024
Fast Kronecker Matrix-Matrix Multiplication on GPUs
Abhinav Jangda
Mohit Yadav
19
2
0
18 Jan 2024
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
Gianpietro Consolaro
Zhen Zhang
Harenome Razanajato
Nelson Lossing
Nassim Tchoulak
...
Artur Cesar Araujo Alves
Renwei Zhang
Denis Barthou
Corinne Ancourt
Cédric Bastoul
13
3
0
12 Jan 2024
conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks
Tahseen Rabbani
Jiahao Su
Xiaoyu Liu
David Chan
Geoffrey Sangston
Furong Huang
29
1
0
07 Jan 2024
GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy
Tianhao Peng
Wenjun Wu
Haitao Yuan
Zhifeng Bao
Pengrui Zhao
Xin Yu
Xuetao Lin
Yu Liang
Yanjun Pu
35
10
0
15 Dec 2023
Packrat: Automatic Reconfiguration for Latency Minimization in CPU-based DNN Serving
Ankit Bhardwaj
Amar Phanishayee
Deepak Narayanan
Mihail Tarta
Ryan Stutsman
11
2
0
30 Nov 2023
A Compiler from Array Programs to Vectorized Homomorphic Encryption
Rolph Recto
Andrew C. Myers
11
1
0
10 Nov 2023
Restoring the Broken Covenant Between Compilers and Deep Learning Accelerators
Sean Kinzer
Soroush Ghodrati
R. Mahapatra
Byung Hoon Ahn
Edwin Mascarenhas
Xiaolong Li
J. Matai
Liang Zhang
H. Esmaeilzadeh
25
2
0
27 Oct 2023
Serving Deep Learning Model in Relational Databases
Alexandre Eichenberger
Qi Lin
Saif Masood
Hong Min
Alexander Sim
...
Yida Wang
Kesheng Wu
Binhang Yuan
Lixi Zhou
Jia Zou
15
0
0
07 Oct 2023
YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs
Cyrus Zhou
Zack Hassman
Ruize Xu
Dhirpal Shah
Vaughn Richard
Yanjing Li
29
1
0
01 Oct 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
16
0
0
04 Sep 2023
Saturn: An Optimized Data System for Large Model Deep Learning Workloads
Kabir Nagrecha
Arun Kumar
11
6
0
03 Sep 2023
Target-independent XLA optimization using Reinforcement Learning
Milan Ganai
Haichen Li
Theodore Enns
Yida Wang
Randy Huang
32
0
0
28 Aug 2023
TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs
P. Phothilimthana
Sami Abu-El-Haija
Kaidi Cao
Bahare Fatemi
Mike Burrows
Charith Mendis
Bryan Perozzi
GNN
AI4TS
25
17
0
25 Aug 2023
MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems
Guan Shen
Jieru Zhao
Zeke Wang
Zhehan Lin
Wenchao Ding
Chentao Wu
Quan Chen
Minyi Guo
26
4
0
23 Jul 2023
Maximum Flows in Parametric Graph Templates
Tal Ben-Nun
Lukas Gianinazzi
Torsten Hoefler
Yishai Oltchik
8
0
0
17 Jul 2023
Bridging Control-Centric and Data-Centric Optimization
Tal Ben-Nun
Berke Ates
A. Calotoiu
Torsten Hoefler
23
7
0
01 Jun 2023
AMULET: Adaptive Matrix-Multiplication-Like Tasks
Junyoung Kim
Kenneth Ross
Eric Sedlar
Lukas Stadler
11
1
0
12 May 2023
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures
E. Georganas
Dhiraj D. Kalamkar
K. Voronin
Abhisek Kundu
Antonio Noack
Hans Pabst
Alexander Breuer
A. Heinecke
11
2
0
25 Apr 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
28
100
0
27 Feb 2023
Operator Fusion in XLA: Analysis and Evaluation
Danielle Snider
Ruofan Liang
16
4
0
30 Jan 2023
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation
Jianhui Li
Zhennan Qin
Yijie Mei
Jingze Cui
Yunfei Song
...
Baihui Jin
Yan Zhang
Jason Ye
Eric Lin
Daniel M. Lavery
GNN
14
8
0
03 Jan 2023
AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
Zhiying Xu
H. Peng
Wei Wang
GNN
24
3
0
02 Dec 2022
AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices
Zhen Du
Jiajia Li
Yinshan Wang
Xueqi Li
Guangming Tan
N. Sun
9
21
0
07 Nov 2022
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning
Yiqiang Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
20
29
0
07 Nov 2022
Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models
Stelios Maroudas
Sotiris Legkas
Prodromos Malakasiotis
Ilias Chalkidis
VLM
AILaw
ALM
ELM
25
4
0
24 Oct 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
18
0
0
22 Oct 2022
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs
Yaoyao Ding
Cody Hao Yu
Bojian Zheng
Yizhi Liu
Yida Wang
Gennady Pekhimenko
21
30
0
18 Oct 2022
Demystifying Map Space Exploration for NPUs
Sheng-Chun Kao
A. Parashar
Po-An Tsai
T. Krishna
30
11
0
07 Oct 2022
Decompiling x86 Deep Neural Network Executables
Zhibo Liu
Yuanyuan Yuan
Shuai Wang
Xiaofei Xie
L. Ma
AAML
31
13
0
03 Oct 2022
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin
18
4
0
26 Sep 2022
SpDISTAL: Compiling Distributed Sparse Tensor Computations
Rohan Yadav
Alex Aiken
Fredrik Kjolstad
MoE
9
10
0
28 Jul 2022
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
Siyuan Feng
Bohan Hou
Hongyi Jin
Wuwei Lin
Junru Shao
...
Zihao Ye
Lianmin Zheng
Cody Hao Yu
Yong Yu
Tianqi Chen
15
65
0
09 Jul 2022
Tensor Program Optimization with Probabilistic Programs
Junru Shao
Xiyou Zhou
Siyuan Feng
Bohan Hou
Ruihang Lai
Hongyi Jin
Wuwei Lin
Masahiro Masuda
Cody Hao Yu
Tianqi Chen
17
28
0
26 May 2022
Neural Architecture Search using Property Guided Synthesis
Charles Jin
P. Phothilimthana
Sudip Roy
25
6
0
08 May 2022
DISTAL: The Distributed Tensor Algebra Compiler
Rohan Yadav
A. Aiken
Fredrik Kjolstad
11
29
0
15 Mar 2022