2205.14135
Cited By
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
12 / 1,462 papers shown
Title
Understanding Performance of Long-Document Ranking Models through Comprehensive Evaluation and Leaderboarding
Leonid Boytsov
David Akinpelu
Tianyi Lin
Fangwei Gao
Yutian Zhao
Jeffrey Huang
Nipun Katyal
Eric Nyberg
88
9
0
04 Jul 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
71
130
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
84
538
0
13 Jun 2022
Transformer Quality in Linear Time
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
86
223
0
21 Feb 2022
Self-attention Does Not Need $O(n^2)$ Memory
M. Rabe
Charles Staats
LRM
43
144
0
10 Dec 2021
An Empirical Study: Extensive Deep Temporal Point Process
Haitao Lin
Cheng Tan
Lirong Wu
Zhangyang Gao
Stan. Z. Li
AI4TS
38
12
0
19 Oct 2021
Combiner: Full Attention Transformer with Sparse Computation Cost
Hongyu Ren
H. Dai
Zihang Dai
Mengjiao Yang
J. Leskovec
Dale Schuurmans
Bo Dai
87
78
0
12 Jul 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
284
180
0
17 Feb 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
342
2,041
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
266
585
0
12 Mar 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
249
1,850
0
17 Sep 2019
Neural Legal Judgment Prediction in English
Ilias Chalkidis
Ion Androutsopoulos
Nikolaos Aletras
AILaw
ELM
129
327
0
05 Jun 2019