2402.07443
Cited By
The I/O Complexity of Attention, or How Optimal is Flash Attention?
12 February 2024
Barna Saha
Christopher Ye
Papers citing
"The I/O Complexity of Attention, or How Optimal is Flash Attention?"
2 / 2 papers shown
Title
Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference
Claudio Angione
Yue Zhao
Harry Yang
Ahmad Farhan
Fielding Johnston
James Buban
Patrick Colangelo
42
1
0
29 Jul 2024
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
282
2,015
0
28 Jul 2020