FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
arXiv:2205.14135 · 27 May 2022
Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
Tags: VLM

Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
Showing 32 of 1,432 citing papers.

Modeling Multivariate Biosignals With Graph Neural Networks and Structured State Space Models (21 Nov 2022)
Siyi Tang, Jared A. Dunnmon, Liangqiong Qu, Khaled Kamal Saab, T. Baykaner, Christopher Lee-Messer, D. Rubin
30 · 21 · 0

Breadth-First Pipeline Parallelism (11 Nov 2022)
J. Lamy-Poirier
Tags: GNN, MoE, AI4CE
28 · 1 · 0

Efficiently Scaling Transformer Inference (09 Nov 2022)
Reiner Pope, Sholto Douglas, Aakanksha Chowdhery, Jacob Devlin, James Bradbury, Anselm Levskaya, Jonathan Heek, Kefan Xiao, Shivani Agrawal, J. Dean
37 · 295 · 0

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers (31 Oct 2022)
Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
Tags: MQ
33 · 898 · 0

Inference from Real-World Sparse Measurements (20 Oct 2022)
Arnaud Pannatier, Kyle Matoba, F. Fleuret
Tags: AI4TS
28 · 0 · 0

FIMP: Foundation Model-Informed Message Passing for Graph Neural Networks (17 Oct 2022)
S. Rizvi, Nazreen Pallikkavaliyaveetil, David Zhang, Zhuoyang Lyu, Nhi Nguyen, ..., Amin Karbasi, Rex Ying, Maria Brbić, Rahul M. Dhodapkar, David van Dijk
Tags: GNN, AI4CE
21 · 1 · 0

Token Merging: Your ViT But Faster (17 Oct 2022)
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman
Tags: MoMe
51 · 422 · 0

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling (14 Oct 2022)
Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong
Tags: 3DV
43 · 9 · 0

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities (13 Oct 2022)
Brian Bartoldson, B. Kailkhura, Davis W. Blalock
31 · 47 · 0

S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces (12 Oct 2022)
Eric N. D. Nguyen, Karan Goel, Albert Gu, Gordon W. Downs, Preey Shah, Tri Dao, S. Baccus, Christopher Ré
Tags: VLM
22 · 39 · 0

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs (06 Oct 2022)
Yujia Zhai, Chengquan Jiang, Leyuan Wang, Xiaoying Jia, Shang Zhang, Zizhong Chen, Xin Liu, Yibo Zhu
62 · 48 · 0

Dilated Neighborhood Attention Transformer (29 Sep 2022)
Ali Hassani, Humphrey Shi
Tags: ViT, MedIm
33 · 68 · 0

DPNet: Dual-Path Network for Real-time Object Detection with Lightweight Attention (28 Sep 2022)
Quan Zhou, Huiming Shi, Wei Xiang, Bin Kang, Xiaofu Wu, Longin Jan Latecki
Tags: ObjD
22 · 31 · 0

Hydra Attention: Efficient Attention with Many Heads (15 Sep 2022)
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman
99 · 77 · 0

Efficient Methods for Natural Language Processing: A Survey (31 Aug 2022)
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
30 · 109 · 0

Boosting Distributed Training Performance of the Unpadded BERT Model (17 Aug 2022)
Jinle Zeng, Min Li, Zhihua Wu, Jiaqi Liu, Yuang Liu, Dianhai Yu, Yanjun Ma
17 · 10 · 0

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment (26 Jul 2022)
Qiang Chen, Xiaokang Chen, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang
Tags: ViT
49 · 120 · 0

DETRs with Hybrid Matching (26 Jul 2022)
Ding Jia, Yuhui Yuan, Hao He, Xiao-pei Wu, Haojun Yu, Weihong Lin, Lei-huan Sun, Chao Zhang, Hanhua Hu
26 · 182 · 0

Efficient High-Resolution Deep Learning: A Survey (26 Jul 2022)
Arian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis
Tags: MedIm
21 · 19 · 0

Vision Transformers: From Semantic Segmentation to Dense Prediction (19 Jul 2022)
Li Zhang, Jiachen Lu, Sixiao Zheng, Xinxuan Zhao, Xiatian Zhu, Yanwei Fu, Tao Xiang, Jianfeng Feng, Philip H. S. Torr
Tags: ViT
27 · 7 · 0

Understanding Performance of Long-Document Ranking Models through Comprehensive Evaluation and Leaderboarding (04 Jul 2022)
Leonid Boytsov, David Akinpelu, Tianyi Lin, Fangwei Gao, Yutian Zhao, Jeffrey Huang, Nipun Katyal, Eric Nyberg
47 · 9 · 0

LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks (14 Jun 2022)
Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee
Tags: LMTD
42 · 127 · 0

Multimodal Learning with Transformers: A Survey (13 Jun 2022)
P. Xu, Xiatian Zhu, David A. Clifton
Tags: ViT
72 · 528 · 0

Transformer Quality in Linear Time (21 Feb 2022)
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
78 · 222 · 0

Self-attention Does Not Need O(n^2) Memory (10 Dec 2021)
M. Rabe, Charles Staats
Tags: LRM
26 · 139 · 0

An Empirical Study: Extensive Deep Temporal Point Process (19 Oct 2021)
Haitao Lin, Cheng Tan, Lirong Wu, Zhangyang Gao, Stan Z. Li
Tags: AI4TS
13 · 12 · 0

Combiner: Full Attention Transformer with Sparse Computation Cost (12 Jul 2021)
Hongyu Ren, H. Dai, Zihang Dai, Mengjiao Yang, J. Leskovec, Dale Schuurmans, Bo Dai
87 · 77 · 0

LambdaNetworks: Modeling Long-Range Interactions Without Attention (17 Feb 2021)
Irwan Bello
281 · 179 · 0

Big Bird: Transformers for Longer Sequences (28 Jul 2020)
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
Tags: VLM
285 · 2,017 · 0

Efficient Content-Based Sparse Attention with Routing Transformers (12 Mar 2020)
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
Tags: MoE
252 · 580 · 0

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism (17 Sep 2019)
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
Tags: MoE
245 · 1,826 · 0

Neural Legal Judgment Prediction in English (05 Jun 2019)
Ilias Chalkidis, Ion Androutsopoulos, Nikolaos Aletras
Tags: AILaw, ELM
123 · 325 · 0