arXiv:1805.08166
Learning to Optimize Tensor Programs
21 May 2018
Tianqi Chen
Lianmin Zheng
Eddie Q. Yan
Ziheng Jiang
T. Moreau
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
Papers citing
"Learning to Optimize Tensor Programs"
50 / 145 papers shown
JIR-Arena: The First Benchmark Dataset for Just-in-time Information Recommendation
Ke Yang
Kevin Ros
Shankar Kumar Senthil Kumar
ChengXiang Zhai
9
0
0
19 May 2025
Improving Assembly Code Performance with Large Language Models via Reinforcement Learning
Anjiang Wei
Tarun Suresh
Huanmi Tan
Yinglun Xu
Gagandeep Singh
Alex Aiken
14
0
0
16 May 2025
QiMeng-TensorOp: Automatically Generating High-Performance Tensor Operators with Hardware Primitives
X. Zhang
Shaohui Peng
Qirui Zhou
Yuanbo Wen
Qi Guo
...
Ke Gao
Chen Zhao
Yanjun Wu
Yunji Chen
Ling Li
VLM
39
0
0
08 May 2025
Data-efficient Performance Modeling via Pre-training
Chunting Liu
Riyadh Baghdadi
57
0
0
24 Jan 2025
MAS-Attention: Memory-Aware Stream Processing for Attention Acceleration on Resource-Constrained Edge Devices
Mohammadali Shakerdargah
Shan Lu
Chao Gao
Di Niu
79
0
0
20 Nov 2024
LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
Mufei Li
Viraj Shitole
Eli Chien
Changhai Man
Zhaodong Wang
Srinivas Sridharan
Ying Zhang
Tushar Krishna
P. Li
53
0
0
04 Nov 2024
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yuzhe Yang
Yipeng Du
Ahmad Farhan
Claudio Angione
Yue Zhao
Harry Yang
Fielding Johnston
James Buban
Patrick Colangelo
39
0
0
28 Oct 2024
Scheduling Languages: A Past, Present, and Future Taxonomy
Mary Hall
Cosmin Oancea
Anne C. Elster
Ari Rasch
Sameeran Joshi
Amir Mohammad Tavakkoli
Richard Schulze
40
1
0
25 Oct 2024
A Benchmark on Directed Graph Representation Learning in Hardware Designs
Haoyu Wang
Yinan Huang
Nan Wu
Pan Li
OOD
54
1
0
09 Oct 2024
Vortex: Efficient Sample-Free Dynamic Tensor Program Optimization via Hardware-aware Strategy Space Hierarchization
Yangjie Zhou
Honglin Zhu
Qian Qiu
Weihao Cui
Zihan Liu
...
Jintao Meng
Haidong Lan
Jingwen Leng
Wenxi Zhu
Minwen Deng
44
0
0
02 Sep 2024
Efficient LLM Scheduling by Learning to Rank
Yichao Fu
Siqi Zhu
Runlong Su
Aurick Qiao
Ion Stoica
Hao Zhang
58
19
0
28 Aug 2024
Efficient Edge AI: Deploying Convolutional Neural Networks on FPGA with the Gemmini Accelerator
Federico Nicolás Peccia
Svetlana Pavlitska
Tobias Fleck
Oliver Bringmann
30
0
0
14 Aug 2024
Combining Neural Architecture Search and Automatic Code Optimization: A Survey
Inas Bachiri
Hadjer Benmeziane
Smail Niar
Riyadh Baghdadi
Hamza Ouarnoughi
Abdelkrime Aries
47
0
0
07 Aug 2024
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
Jianyu Wei
Shijie Cao
Ting Cao
Lingxiao Ma
Lei Wang
Yanyong Zhang
Mao Yang
MQ
53
11
0
25 Jun 2024
Optimal Kernel Orchestration for Tensor Programs with Korch
Muyan Hu
Ashwin Venkatram
Shreyashri Biswas
Balamurugan Marimuthu
Bohan Hou
Gabriele Oliaro
Haojie Wang
Liyan Zheng
Xupeng Miao
Jidong Zhai
218
4
0
13 Jun 2024
Graph neural networks with configuration cross-attention for tensor compilers
Dmitrii Khizbullin
Eduardo Rocha de Andrade
Thanh Hau Nguyen
Matheus Pedroza Ferreira
David R. Pugh
GNN
26
0
0
26 May 2024
Allo: A Programming Model for Composable Accelerator Design
Hongzheng Chen
Niansong Zhang
Shaojie Xiang
Zhichen Zeng
Mengjia Dai
Zhiru Zhang
56
14
0
07 Apr 2024
GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU
Zhongming Yu
Genghan Zhang
Hanxian Huang
Xin Chen
Jishen Zhao
GNN
29
0
0
03 Apr 2024
Tiny Machine Learning: Progress and Futures
Ji Lin
Ligeng Zhu
Wei-Ming Chen
Wei-Chen Wang
Song Han
57
51
0
28 Mar 2024
Accelerating String-Key Learned Index Structures via Memoization-based Incremental Training
Minsu Kim
Jinwoo Hwang
Guseul Heo
Seiyeon Cho
Divya Mahajan
Jongse Park
37
2
0
18 Mar 2024
LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers
Massinissa Merouani
Khaled Afif Boudaoud
Iheb Nassim Aouadj
Nassim Tchoulak
Islam Kara Bernou
Hamza Benyamina
F. B. Tayeb
K. Benatchba
Hugh Leather
Riyadh Baghdadi
48
2
0
18 Mar 2024
DLAS: An Exploration and Assessment of the Deep Learning Acceleration Stack
Perry Gibson
José Cano
Elliot J. Crowley
Amos Storkey
Michael F. P. O'Boyle
27
1
0
15 Nov 2023
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Ruihang Lai
Junru Shao
Siyuan Feng
Steven Lyubomirsky
Bohan Hou
...
Sunghyun Park
Prakalp Srivastava
Jared Roesch
T. Mowry
Tianqi Chen
47
9
0
01 Nov 2023
Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
Guillermo Alaejos
Adrián Castelló
P. Alonso-Jordá
Francisco D. Igual
Héctor J. Martínez
Enrique S. Quintana-Ortí
24
2
0
31 Oct 2023
Tackling the Matrix Multiplication Micro-kernel Generation with Exo
Adrián Castelló
Julian Bellavita
Grace Dinh
Yuka Ikarashi
Héctor J. Martínez
8
4
0
26 Oct 2023
AdaMEC: Towards a Context-Adaptive and Dynamically-Combinable DNN Deployment Framework for Mobile Edge Computing
Bowen Pang
Sicong Liu
Hongli Wang
Bin Guo
Yuzhan Wang
Hao Wang
Zhenli Sheng
Zhongyi Wang
Zhiwen Yu
27
2
0
25 Oct 2023
Supersonic: Learning to Generate Source Code Optimizations in C/C++
Zimin Chen
Sen Fang
Martin Monperrus
50
11
0
26 Sep 2023
Compilation as a Defense: Enhancing DL Model Attack Robustness via Tensor Optimization
Stefan Trawicki
William Hackett
Lewis Birch
M. Dascalu
Peter Garraghan
AAML
32
0
0
20 Sep 2023
DeepliteRT: Computer Vision at the Edge
Saad Ashfaq
Alexander Hoffman
Saptarshi Mitra
Sudhakar Sah
Mohammadhossein Askarihemmat
Ehsan Saboori
VLM
MQ
37
0
0
19 Sep 2023
Autotuning Apache TVM-based Scientific Applications Using Bayesian Optimization
Xingfu Wu
P. Paramasivam
Valerie Taylor
23
3
0
13 Sep 2023
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Dejan Grubisic
Bram Wasti
Chris Cummins
John Mellor-Crummey
A. Zlateski
27
0
0
04 Sep 2023
Target-independent XLA optimization using Reinforcement Learning
Milan Ganai
Haichen Li
Theodore Enns
Yida Wang
Randy Huang
44
0
0
28 Aug 2023
TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs
P. Phothilimthana
Sami Abu-El-Haija
Kaidi Cao
Bahare Fatemi
Mike Burrows
Charith Mendis
Bryan Perozzi
GNN
AI4TS
35
17
0
25 Aug 2023
Tango: rethinking quantization for graph neural network training on GPUs
Shiyang Chen
Da Zheng
Caiwen Ding
Chengying Huan
Yuede Ji
Hang Liu
GNN
MQ
36
5
0
02 Aug 2023
KAPLA: Pragmatic Representation and Fast Solving of Scalable NN Accelerator Dataflow
Zhiyao Li
Mingyu Gao
29
1
0
09 Jun 2023
Incremental Randomized Smoothing Certification
Shubham Ugare
Tarun Suresh
Debangshu Banerjee
Gagandeep Singh
Sasa Misailovic
AAML
40
8
0
31 May 2023
Learning Large Graph Property Prediction via Graph Segment Training
Kaidi Cao
P. Phothilimthana
Sami Abu-El-Haija
Dustin Zelle
Yanqi Zhou
Charith Mendis
J. Leskovec
Bryan Perozzi
25
9
0
21 May 2023
Optimizing Memory Mapping Using Deep Reinforcement Learning
Pengming Wang
Mikita Sazanovich
Berkin Ilbeyi
P. Phothilimthana
Manish Purohit
...
R. Tung
Paula Kurylowicz
Kieran Milan
Oriol Vinyals
D. Mankowitz
22
4
0
11 May 2023
Canvas: End-to-End Kernel Architecture Search in Neural Networks
Chenggang Zhao
Genghan Zhang
Mingyu Gao
28
1
0
16 Apr 2023
STen: Productive and Efficient Sparsity in PyTorch
Andrei Ivanov
Nikoli Dryden
Tal Ben-Nun
Saleh Ashkboos
Torsten Hoefler
39
4
0
15 Apr 2023
Transfer Learning Across Heterogeneous Features For Efficient Tensor Program Generation
Gaurav Verma
Siddhisanket Raskar
Zhenda Xie
A. Malik
M. Emani
Barbara M. Chapman
40
2
0
11 Apr 2023
Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization
Lukas Trümper
Tal Ben-Nun
Philipp Schaad
A. Calotoiu
Torsten Hoefler
50
0
0
14 Mar 2023
Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training
Hongzheng Chen
Cody Hao Yu
Shuai Zheng
Zhen Zhang
Zhiru Zhang
Yida Wang
33
6
0
16 Feb 2023
ML-driven Hardware Cost Model for MLIR
Dibyendu Das
Sandya Mannarswamy
29
0
0
14 Feb 2023
CMLCompiler: A Unified Compiler for Classical Machine Learning
Xu Wen
Wanling Gao
An-Dong Li
Lei Wang
Zihan Jiang
Jianfeng Zhan
28
0
0
31 Jan 2023
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation
Jianhui Li
Zhennan Qin
Yijie Mei
Jingze Cui
Yunfei Song
...
Baihui Jin
Yan Zhang
Jason Ye
Eric Lin
Daniel M. Lavery
GNN
22
8
0
03 Jan 2023
Towards Hardware-Specific Automatic Compression of Neural Networks
Torben Krieger
Bernhard Klein
Holger Fröning
MQ
32
2
0
15 Dec 2022
Integration of a systolic array based hardware accelerator into a DNN operator auto-tuning framework
Federico Nicolás Peccia
Oliver Bringmann
19
5
0
06 Dec 2022
AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
Zhiying Xu
H. Peng
Wei Wang
GNN
31
3
0
02 Dec 2022
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks
Zining Zhang
Bingsheng He
Zhenjie Zhang
14
5
0
21 Nov 2022