Flex Attention: A Programming Model for Generating Optimized Attention Kernels
arXiv:2412.05496
7 December 2024
Juechu Dong, Boyuan Feng, Driss Guessous, Yanbo Liang, Horace He
Papers citing "Flex Attention: A Programming Model for Generating Optimized Attention Kernels" (3 of 3 papers shown)
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, Volodymyr Kuleshov
12 Mar 2025
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
Emily Xiao
Chin-Jou Li
Yilin Zhang
Graham Neubig
Amanda Bertsch
BDL
80
0
0
11 Mar 2025
AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms
Feiyang Chen, Yu Cheng, Lei Wang, Yuqing Xia, Ziming Miao, ..., Fan Yang, Jinbao Xue, Zhi Yang, M. Yang, H. Chen
24 Feb 2025