1805.08166
Learning to Optimize Tensor Programs
21 May 2018
Tianqi Chen
Lianmin Zheng
Eddie Q. Yan
Ziheng Jiang
T. Moreau
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
Papers citing
"Learning to Optimize Tensor Programs"
50 / 145 papers shown
Title
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning
Yiqiang Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
30
30
0
07 Nov 2022
Rethinking Storage Management for Data Processing Pipelines in Cloud Data Centers
Ubaid Ullah Hafeez
Martin Maas
Mustafa Uysal
Richard McDougall
18
0
0
04 Nov 2022
Exploring Effects of Computational Parameter Changes to Image Recognition Systems
Nikolaos Louloudakis
Perry Gibson
José Cano
A. Rajan
24
6
0
01 Nov 2022
ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs
Guyue Huang
Yang Bai
L. Liu
Yuke Wang
Bei Yu
Yufei Ding
Yuan Xie
52
16
0
29 Oct 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
42
0
0
22 Oct 2022
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs
Yaoyao Ding
Cody Hao Yu
Bojian Zheng
Yizhi Liu
Yida Wang
Gennady Pekhimenko
29
30
0
18 Oct 2022
Demystifying Map Space Exploration for NPUs
Sheng-Chun Kao
A. Parashar
Po-An Tsai
T. Krishna
40
11
0
07 Oct 2022
Decompiling x86 Deep Neural Network Executables
Zhibo Liu
Yuanyuan Yuan
Shuai Wang
Xiaofei Xie
Lei Ma
AAML
45
13
0
03 Oct 2022
SONAR: Joint Architecture and System Optimization Search
Elias Jääsaari
Michelle Ma
Ameet Talwalkar
Tianqi Chen
43
1
0
25 Aug 2022
OLLIE: Derivation-based Tensor Program Optimizer
Liyan Zheng
Haojie Wang
Jidong Zhai
Muyan Hu
Zixuan Ma
Tuowei Wang
Shizhi Tang
Lei Xie
Kezhao Huang
Zhihao Jia
46
3
0
02 Aug 2022
NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers
Jiawei Liu
Jinkun Lin
Fabian Ruffy
Cheng Tan
Jinyang Li
Aurojit Panda
Lingming Zhang
76
57
0
26 Jul 2022
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning
Zihao Ye
Ruihang Lai
Junru Shao
Tianqi Chen
Luis Ceze
78
93
0
11 Jul 2022
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
Siyuan Feng
Bohan Hou
Hongyi Jin
Wuwei Lin
Junru Shao
...
Zihao Ye
Lianmin Zheng
Cody Hao Yu
Yong Yu
Tianqi Chen
28
66
0
09 Jul 2022
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Taeho Kim
Yongin Kwon
Jemin Lee
Taesoo Kim
Sangtae Ha
35
2
0
04 Jul 2022
Productive Reproducible Workflows for DNNs: A Case Study for Industrial Defect Detection
Perry Gibson
José Cano
AI4CE
38
1
0
19 Jun 2022
HW-Aware Initialization of DNN Auto-Tuning to Improve Exploration Time and Robustness
D. Rieber
Moritz Reiber
Oliver Bringmann
Holger Fröning
29
4
0
31 May 2022
Tensor Program Optimization with Probabilistic Programs
Junru Shao
Xiyou Zhou
Siyuan Feng
Bohan Hou
Ruihang Lai
Hongyi Jin
Wuwei Lin
Masahiro Masuda
Cody Hao Yu
Tianqi Chen
37
29
0
26 May 2022
LoopStack: a Lightweight Tensor Algebra Compiler Stack
Bram Wasti
J. Cambronero
Benoit Steiner
Hugh Leather
A. Zlateski
8
3
0
02 May 2022
Bifrost: End-to-End Evaluation and Optimization of Reconfigurable DNN Accelerators
Axel Stjerngren
Perry Gibson
José Cano
34
4
0
26 Apr 2022
Learning from distinctive candidates to optimize reduced-precision convolution program on tensor cores
Junkyeong Choi
Hyucksung Kwon
W. Lee
Jungwook Choi
Jieun Lim
26
0
0
11 Feb 2022
Flashlight: Enabling Innovation in Tools for Machine Learning
Jacob Kahn
Vineel Pratap
Tatiana Likhomanenko
Qiantong Xu
Awni Y. Hannun
...
Gilad Avidov
Benoit Steiner
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
32
28
0
29 Jan 2022
DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators
Sheng-Chun Kao
Xiaoyu Huang
T. Krishna
AI4CE
40
9
0
26 Jan 2022
VELTAIR: Towards High-Performance Multi-tenant Deep Learning Services via Adaptive Compilation and Scheduling
Zihan Liu
Jingwen Leng
Zhihui Zhang
Quan Chen
Chao Li
Minyi Guo
27
46
0
17 Jan 2022
Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization
Zhihe Zhao
Xian Shuai
Yang Bai
Neiwen Ling
Nan Guan
Zhenyu Yan
Guoliang Xing
33
6
0
15 Jan 2022
Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation
Perry Gibson
José Cano
33
12
0
14 Jan 2022
BoGraph: Structured Bayesian Optimization From Logs for Expensive Systems with Many Parameters
Sami Alabed
Eiko Yoneki
23
7
0
16 Dec 2021
A Highly Configurable Hardware/Software Stack for DNN Inference Acceleration
Suvadeep Banerjee
Steve Burns
P. Cocchini
A. Davare
Shweta Jain
D. Kirkpatrick
A. Sorokin
Jin Yang
Zhenkun Yang
28
9
0
29 Nov 2021
Collage: Seamless Integration of Deep Learning Backends with Automatic Placement
Byungsoo Jeon
Sunghyun Park
Peiyuan Liao
Sheng Xu
Tianqi Chen
Zhihao Jia
VLM
41
4
0
01 Nov 2021
Characterizing and Taming Resolution in Convolutional Neural Networks
Eddie Q. Yan
Liang Luo
Luis Ceze
31
0
0
28 Oct 2021
Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
Jiarong Xing
Leyuan Wang
Shang Zhang
Jack H Chen
Ang Chen
Yibo Zhu
35
43
0
25 Oct 2021
The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding
Pratik Fegade
Tianqi Chen
Phillip B. Gibbons
T. Mowry
25
29
0
19 Oct 2021
SoftNeuro: Fast Deep Inference using Multi-platform Optimization
Masaki Hilaga
Yasuhiro Kuroda
Hitoshi Matsuo
Tatsuya Kawaguchi
Gabriel Ogawa
Hiroshi Miyake
Yusuke Kozawa
26
1
0
12 Oct 2021
SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference
Jude Haris
Perry Gibson
José Cano
Nicolas Bohm Agostini
David Kaeli
44
19
0
01 Oct 2021
Learning to Superoptimize Real-world Programs
Alex Shypula
Pengcheng Yin
Jeremy Lacomis
Claire Le Goues
Edward N. Schwartz
Graham Neubig
NAI
121
10
0
28 Sep 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
32
147
0
30 Aug 2021
Using Graph Neural Networks to model the performance of Deep Neural Networks
Shikhar Singh
Benoit Steiner
James Hegarty
Hugh Leather
GNN
22
3
0
27 Aug 2021
AIRCHITECT: Learning Custom Architecture Design and Mapping Space
A. Samajdar
J. Joseph
Matthew Denton
T. Krishna
33
7
0
16 Aug 2021
NeurObfuscator: A Full-stack Obfuscation Tool to Mitigate Neural Architecture Stealing
Jingtao Li
Zhezhi He
Adnan Siraj Rakin
Deliang Fan
C. Chakrabarti
29
24
0
20 Jul 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao
Suvinay Subramanian
Gaurav Agrawal
Amir Yazdanbakhsh
T. Krishna
51
58
0
13 Jul 2021
RHNAS: Realizable Hardware and Neural Architecture Search
Yash Akhauri
Adithya Niranjan
J. P. Muñoz
Suvadeep Banerjee
A. Davare
P. Cocchini
A. Sorokin
R. Iyer
Nilesh Jain
27
3
0
17 Jun 2021
NAAS: Neural Accelerator Architecture Search
Chengyue Wu
Mengtian Yang
Song Han
34
60
0
27 May 2021
Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations
Jaehoon Koo
Prasanna Balaprakash
Michael Kruse
Xingfu Wu
P. Hovland
Mary W. Hall
33
7
0
10 May 2021
HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation
Qingcheng Xiao
Wenlei Bao
Bingzhe Wu
Pengcheng Xu
Xuehai Qian
Yun Liang
45
67
0
04 May 2021
Bring Your Own Codegen to Deep Learning Compiler
Zhi Chen
Cody Hao Yu
Trevor Morris
Jorn Tuyls
Yi-Hsiang Lai
Jared Roesch
Elliott Delaye
Vin Sharma
Yida Wang
27
14
0
03 May 2021
Tuna: A Static Analysis Approach to Optimizing Deep Neural Networks
Yao Wang
Xingyu Zhou
Yanming Wang
Rui Li
Yong Wu
Vin Sharma
37
8
0
29 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
35
68
0
11 Apr 2021
Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming
D. Rieber
Axel Acosta
Holger Fröning
14
0
0
10 Apr 2021
Automated Backend-Aware Post-Training Quantization
Ziheng Jiang
Animesh Jain
An Liu
Josh Fromm
Chengqian Ma
Tianqi Chen
Luis Ceze
MQ
37
2
0
27 Mar 2021
MetaTune: Meta-Learning Based Cost Model for Fast and Efficient Auto-tuning Frameworks
Jaehun Ryu
Hyojin Sung
57
16
0
08 Feb 2021
Understanding Cache Boundness of ML Operators on ARM Processors
Bernhard Klein
Christoph Gratl
Manfred Mücke
Holger Fröning
MQ
19
1
0
01 Feb 2021