Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

1 November 2023
Ruihang Lai
Junru Shao
Siyuan Feng
Steven Lyubomirsky
Bohan Hou
Wuwei Lin
Zihao Ye
Hongyi Jin
Yuchen Jin
Jiawei Liu
Lesheng Jin
Yaxing Cai
Ziheng Jiang
Yong Wu
Sunghyun Park
Prakalp Srivastava
Jared Roesch
T. Mowry
Tianqi Chen
arXiv: 2311.02103

Papers citing "Relax: Composable Abstractions for End-to-End Dynamic Machine Learning"

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
Zihao Ye
Lequn Chen
Ruihang Lai
Wuwei Lin
Yineng Zhang
...
Tianqi Chen
Baris Kasikci
Vinod Grover
Arvind Krishnamurthy
Luis Ceze
82
24
0
02 Jan 2025
Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations
Bowen Shen
Zheng Lin
Daren Zha
Wei Liu
Jian Luan
Bin Wang
Weiping Wang
79
1
0
08 Jul 2024
GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU
Zhongming Yu
Genghan Zhang
Hanxian Huang
Xin Chen
Jishen Zhao
GNN
48
0
0
03 Apr 2024
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
80
78
0
23 Dec 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
115
2,049
0
12 Sep 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
345
4,607
0
17 Apr 2023
Stream-K: Work-centric Parallel Decomposition for Dense Matrix-Matrix Multiplication on the GPU
Muhammad Osama
D. Merrill
C. Cecka
M. Garland
John Douglas Owens
19
27
0
09 Jan 2023
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning
Zihao Ye
Ruihang Lai
Junru Shao
Tianqi Chen
Luis Ceze
85
93
0
11 Jul 2022
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
Siyuan Feng
Bohan Hou
Hongyi Jin
Wuwei Lin
Junru Shao
...
Zihao Ye
Lianmin Zheng
Cody Hao Yu
Yong Yu
Tianqi Chen
34
66
0
09 Jul 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
166
2,131
0
27 May 2022
Tensor Program Optimization with Probabilistic Programs
Junru Shao
Xiyou Zhou
Siyuan Feng
Bohan Hou
Ruihang Lai
Hongyi Jin
Wuwei Lin
Masahiro Masuda
Cody Hao Yu
Tianqi Chen
67
31
0
26 May 2022
Torch.fx: Practical Program Capture and Transformation for Deep Learning in Python
James K. Reed
Zach DeVito
Horace He
Ansley Ussery
Jason Ansel
CLIP
38
49
0
15 Dec 2021
The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding
Pratik Fegade
Tianqi Chen
Phillip B. Gibbons
T. Mowry
40
29
0
19 Oct 2021
Cortex: A Compiler for Recursive Deep Learning Models
Pratik Fegade
Tianqi Chen
Phillip B. Gibbons
T. Mowry
VLM
18
28
0
02 Nov 2020
Ansor: Generating High-Performance Tensor Programs for Deep Learning
Lianmin Zheng
Chengfan Jia
Minmin Sun
Zhao Wu
Cody Hao Yu
...
Jun Yang
Danyang Zhuo
Koushik Sen
Joseph E. Gonzalez
Ion Stoica
116
391
0
11 Jun 2020
Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference
Haichen Shen
Jared Roesch
Zhi Chen
Wei Chen
Yong Wu
Mu Li
Vin Sharma
Zachary Tatlock
Yida Wang
36
57
0
04 Jun 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
277
42,038
0
03 Dec 2019
MIOpen: An Open Source Library For Deep Learning Primitives
Jehandad Khan
Paul Fultz
Artem Tamazov
Daniel Lowell
Chao-Jung Liu
...
Vasilii Filippov
Jing Zhang
Jing Zhou
Bragadeesh Natarajan
Mayank Daga
VLM
MoE
24
38
0
30 Sep 2019
Relay: A New IR for Machine Learning Frameworks
Jared Roesch
Steven Lyubomirsky
Logan Weber
Josh Pollock
Marisa Kirisame
Tianqi Chen
Zachary Tatlock
48
105
0
26 Sep 2018
Learning to Optimize Tensor Programs
Tianqi Chen
Lianmin Zheng
Eddie Q. Yan
Ziheng Jiang
T. Moreau
Luis Ceze
Carlos Guestrin
Arvind Krishnamurthy
61
396
0
21 May 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
453
129,831
0
12 Jun 2017
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhiwen Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zhang
GNN
AI4CE
338
18,300
0
27 May 2016
cuDNN: Efficient Primitives for Deep Learning
Sharan Chetlur
Cliff Woolley
Philippe Vandermersch
Jonathan M. Cohen
J. Tran
Bryan Catanzaro
Evan Shelhamer
95
1,844
0
03 Oct 2014