2205.14135
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
50 / 1,508 papers shown
Title
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Shizhe Diao
Boyao Wang
Hanze Dong
Kashun Shum
Jipeng Zhang
Wei Xiong
Tong Zhang
ALM
100
66
0
21 Jun 2023
Textbooks Are All You Need
Suriya Gunasekar
Yi Zhang
J. Aneja
C. C. T. Mendes
Allison Del Giorno
...
Sébastien Bubeck
Ronen Eldan
Adam Tauman Kalai
Y. Lee
Yuan-Fang Li
AI4CE
ALM
SyDa
108
411
0
20 Jun 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
95
14
0
19 Jun 2023
Anticipatory Music Transformer
John Thickstun
David Leo Wright Hall
Chris Donahue
Percy Liang
77
16
0
14 Jun 2023
INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation
Yuji Chai
John Gkountouras
Glenn G. Ko
David Brooks
Gu-Yeon Wei
MQ
61
19
0
13 Jun 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
551
4,453
0
09 Jun 2023
Simple and Controllable Music Generation
Jade Copet
Felix Kreuk
Itai Gat
Tal Remez
David Kant
Gabriel Synnaeve
Yossi Adi
Alexandre Défossez
MGen
147
377
0
08 Jun 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability
Xiang Wang
Hangjie Yuan
Shiwei Zhang
Dayou Chen
Jiuniu Wang
Yingya Zhang
Yujun Shen
Deli Zhao
Jingren Zhou
VGen
DiffM
121
341
0
03 Jun 2023
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention
Matteo Pagliardini
Daniele Paliotta
Martin Jaggi
François Fleuret
LRM
76
25
0
01 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
177
776
0
01 Jun 2023
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Chaitanya K. Ryali
Yuan-Ting Hu
Daniel Bolya
Chen Wei
Haoqi Fan
...
Omid Poursaeed
Judy Hoffman
Jitendra Malik
Yanghao Li
Christoph Feichtenhofer
3DH
128
188
0
01 Jun 2023
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds
Yanyu Li
Huan Wang
Qing Jin
Ju Hu
Pavlo Chemerys
Yun Fu
Yanzhi Wang
Sergey Tulyakov
Jian Ren
VLM
132
165
0
01 Jun 2023
Coneheads: Hierarchy Aware Attention
Albert Tseng
Tao Yu
Toni J.B. Liu
Chris De Sa
3DPC
102
6
0
01 Jun 2023
Protein Design with Guided Discrete Diffusion
Nate Gruver
Samuel Stanton
Nathan C. Frey
Tim G. J. Rudner
I. Hotzel
J. Lafrance-Vanasse
A. Rajpal
Kyunghyun Cho
A. Wilson
DiffM
137
120
0
31 May 2023
Self-Verification Improves Few-Shot Clinical Information Extraction
Zelalem Gero
Chandan Singh
Hao Cheng
Tristan Naumann
Michel Galley
Jianfeng Gao
Hoifung Poon
111
60
0
30 May 2023
Blockwise Parallel Transformer for Large Context Models
Hao Liu
Pieter Abbeel
77
11
0
30 May 2023
Likelihood-Based Diffusion Language Models
Ishaan Gulrajani
Tatsunori B. Hashimoto
DiffM
114
62
0
30 May 2023
From Zero to Turbulence: Generative Modeling for 3D Flow Simulation
Marten Lienen
David Lüdke
Jan Hansen-Palmus
Stephan Günnemann
DiffM
AI4CE
200
30
0
29 May 2023
SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics
A. Ardakani
Altan Haan
Shangyin Tan
Doru-Thom Popovici
Alvin Cheung
Costin Iancu
Koushik Sen
50
4
0
29 May 2023
BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages
Wen Yang
Chong Li
Jiajun Zhang
Chengqing Zong
LRM
95
54
0
29 May 2023
Geometric Algebra Transformer
Johann Brehmer
P. D. Haan
S. Behrends
Taco S. Cohen
102
32
0
28 May 2023
Exploring the Practicality of Generative Retrieval on Dynamic Corpora
Soyoung Yoon
Chaeeun Kim
Hyunji Lee
Joel Jang
Sohee Yang
Minjoon Seo
84
5
0
27 May 2023
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
Zihao Yu
Haoyang Li
Fangcheng Fu
Xupeng Miao
Tengjiao Wang
DiffM
93
8
0
27 May 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
148
205
0
27 May 2023
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time
Zichang Liu
Aditya Desai
Fangshuo Liao
Weitao Wang
Victor Xie
Zhaozhuo Xu
Anastasios Kyrillidis
Anshumali Shrivastava
84
237
0
26 May 2023
Backpack Language Models
John Hewitt
John Thickstun
Christopher D. Manning
Percy Liang
KELM
101
16
0
26 May 2023
Imitating Task and Motion Planning with Visuomotor Transformers
Murtaza Dalal
Ajay Mandlekar
Caelan Reed Garrett
Ankur Handa
Ruslan Salakhutdinov
Dieter Fox
157
57
0
25 May 2023
Landmark Attention: Random-Access Infinite Context Length for Transformers
Amirkeivan Mohtashami
Martin Jaggi
LLMAG
142
164
0
25 May 2023
Online learning of long-range dependencies
Nicolas Zucchet
Robert Meier
Simon Schug
Asier Mujika
João Sacramento
CLL
85
21
0
25 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurelien Lucchi
Thomas Hofmann
101
57
0
25 May 2023
Manifold Diffusion Fields
Ahmed A. A. Elhag
Yuyang Wang
J. Susskind
Miguel Angel Bautista
DiffM
AI4CE
79
6
0
24 May 2023
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
101
10
0
24 May 2023
Just CHOP: Embarrassingly Simple LLM Compression
A. Jha
Tom Sherborne
Evan Pete Walsh
Dirk Groeneveld
Emma Strubell
Iz Beltagy
99
3
0
24 May 2023
Adapting Language Models to Compress Contexts
Alexis Chevalier
Alexander Wettig
Anirudh Ajith
Danqi Chen
LLMAG
79
191
0
24 May 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
96
2
0
24 May 2023
BinaryViT: Towards Efficient and Accurate Binary Vision Transformers
Junrui Xiao
Zhikai Li
Lianwei Yang
Qingyi Gu
MQ
ViT
95
2
0
24 May 2023
Segmented Recurrent Transformer: An Efficient Sequence-to-Sequence Model
Yinghan Long
Sayeed Shafayet Chowdhury
Kaushik Roy
104
1
0
24 May 2023
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Sina J. Semnani
Violet Z. Yao
He Zhang
M. Lam
KELM
AI4MH
115
81
0
23 May 2023
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
54
2
0
23 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
119
4
0
22 May 2023
A Framework for Fine-Grained Synchronization of Dependent GPU Kernels
Abhinav Jangda
Saeed Maleki
M. Dehnavi
Madan Musuvathi
Olli Saarikivi
34
5
0
22 May 2023
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Joshua Ainslie
James Lee-Thorp
Michiel de Jong
Yury Zemlyanskiy
Federico Lebrón
Sumit Sanghai
127
705
0
22 May 2023
Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline
Zangwei Zheng
Xiaozhe Ren
Fuzhao Xue
Yang Luo
Xin Jiang
Yang You
88
64
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
240
612
0
22 May 2023
Non-Autoregressive Document-Level Machine Translation
Guangsheng Bao
Zhiyang Teng
Hao Zhou
Jianhao Yan
Yue Zhang
73
0
0
22 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
106
13
0
22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Rada Mihalcea
LRM
136
6
0
21 May 2023
CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting
Xue Wang
Tian Zhou
Qingsong Wen
Jinyang Gao
Bolin Ding
Rong Jin
AI4TS
83
44
0
20 May 2023
Efficient ConvBN Blocks for Transfer Learning and Beyond
Kaichao You
Guo Qin
Anchang Bao
Mengsi Cao
Ping Huang
Jiulong Shan
Mingsheng Long
66
1
0
19 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
151
122
0
18 May 2023