Flex Attention: A Programming Model for Generating Optimized Attention Kernels
arXiv 2412.05496 · 7 December 2024
Juechu Dong, Boyuan Feng, Driss Guessous, Yanbo Liang, Horace He
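For context on the programming model the paper describes: it is exposed in PyTorch as torch.nn.attention.flex_attention (available from PyTorch 2.5 onward). The sketch below is a minimal, illustrative use of that API with a causal mask_mod and a simple relative-position score_mod; the shapes, bias constant, dtype, and CUDA device are assumptions for the example, not values from the paper.

# Minimal illustrative sketch of the FlexAttention API, assuming PyTorch >= 2.5
# and a CUDA device; shapes and constants below are made up for the example.
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

B, H, S, D = 2, 8, 1024, 64  # batch, heads, sequence length, head dim (illustrative)
q, k, v = (torch.randn(B, H, S, D, device="cuda", dtype=torch.float16) for _ in range(3))

# mask_mod: return True where a query index may attend to a key/value index (causal here).
def causal(b, h, q_idx, kv_idx):
    return q_idx >= kv_idx

# score_mod: elementwise rewrite of the pre-softmax attention score (a simple distance bias).
def rel_bias(score, b, h, q_idx, kv_idx):
    return score + 0.1 * (kv_idx - q_idx)

# The block mask lets the generated kernel skip tiles that are fully masked out.
block_mask = create_block_mask(causal, B=None, H=None, Q_LEN=S, KV_LEN=S)

# torch.compile lowers the score_mod/mask_mod into a fused attention kernel.
compiled_flex = torch.compile(flex_attention)
out = compiled_flex(q, k, v, score_mod=rel_bias, block_mask=block_mask)
print(out.shape)  # torch.Size([2, 8, 1024, 64])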
Papers citing "Flex Attention: A Programming Model for Generating Optimized Attention Kernels" (6 / 6 papers shown)
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang, Jinbo Xing, Bin Xia, Shaoteng Liu, Bohao Peng, Xin Tao, Pengfei Wan, Eric Lo, Jiaya Jia
Tags: DiffM, VGen · 22 May 2025

Scale-invariant Attention
Ben Anson, Xi Wang, Laurence Aitchison
Tags: LRM · 20 May 2025

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, Volodymyr Kuleshov
Tags: DiffM · 12 March 2025

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
Emily Xiao, Chin-Jou Li, Yilin Zhang, Graham Neubig, Amanda Bertsch
Tags: BDL · 11 March 2025

AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms
Feiyang Chen, Yu Cheng, Lei Wang, Yuqing Xia, Ziming Miao, ..., Fan Yang, Jinbao Xue, Zhi Yang, M. Yang, H. Chen
24 February 2025

Adaptive Self-improvement LLM Agentic System for ML Library Development
Genghan Zhang, Weixin Liang, Olivia Hsu, K. Olukotun
4 February 2025