2205.14135
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
Papers citing
"FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
50 / 1,508 papers shown
Title
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Shafiq Joty
ELM
CLL
AI4MH
LRM
ALM
146
27
0
28 Nov 2023
On the Long Range Abilities of Transformers
Itamar Zimerman
Lior Wolf
82
8
0
28 Nov 2023
Fast and Efficient 2-bit LLM Inference on GPU: 2/4/16-bit in a Weight Matrix with Asynchronous Dequantization
Jinhao Li
Jiaming Xu
Shiyao Li
Shan Huang
Jun Liu
Yaoxiu Lian
Guohao Dai
MQ
59
3
0
28 Nov 2023
Swallowing the Bitter Pill: Simplified Scalable Conformer Generation
Yuyang Wang
Ahmed A. A. Elhag
Navdeep Jaitly
J. Susskind
Miguel Angel Bautista
DiffM
109
26
0
27 Nov 2023
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
Zeming Chen
Alejandro Hernández Cano
Angelika Romanou
Antoine Bonnet
Kyle Matoba
...
Axel Marmet
Syrielle Montariol
Mary-Anne Hartley
Martin Jaggi
Antoine Bosselut
LM&MA
AI4MH
MedIm
110
198
0
27 Nov 2023
vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training
Jehyeon Bang
Yujeong Choi
Myeongwoo Kim
Yongdeok Kim
Minsoo Rhu
64
18
0
27 Nov 2023
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
132
10
0
27 Nov 2023
Cerbero-7B: A Leap Forward in Language-Specific LLMs Through Enhanced Chat Corpus Generation and Evaluation
Federico A. Galatolo
M. G. Cimino
77
5
0
27 Nov 2023
SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling
Habib Hajimolahoseini
Omar Mohamed Awad
Walid Ahmed
Austin Wen
Saina Asani
...
Farnoosh Javadi
Mehdi Ahmadi
Foozhan Ataiefard
Kangling Liu
Yang Liu
57
2
0
25 Nov 2023
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
Raghav Addanki
Chenyang Li
Zhao Song
Chiwun Yang
105
3
0
24 Nov 2023
PrivateLoRA For Efficient Privacy Preserving LLM
Yiming Wang
Yu Lin
Xiaodong Zeng
Guannan Zhang
105
14
0
23 Nov 2023
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
LLMAG
KELM
98
66
0
21 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
89
4
0
21 Nov 2023
Applications of Large Scale Foundation Models for Autonomous Driving
Yu Huang
Yue Chen
Zhu Li
ELM
AI4CE
LRM
ALM
LM&Ro
139
16
0
20 Nov 2023
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Youhe Jiang
Ran Yan
Xiaozhe Yao
Yang Zhou
Beidi Chen
Binhang Yuan
SyDa
68
15
0
20 Nov 2023
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
Di Chang
Yichun Shi
Quankai Gao
Jessica Fu
Hongyi Xu
Guoxian Song
Qing Yan
Yizhe Zhu
Xiao Yang
Mohammad Soleymani
DiffM
VGen
102
59
0
18 Nov 2023
A Language Agent for Autonomous Driving
Jiageng Mao
Junjie Ye
Yuxi Qian
Marco Pavone
Yue Wang
LM&Ro
LRM
99
109
0
17 Nov 2023
DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
Chenyu Jiang
Zhen Jia
Shuai Zheng
Yida Wang
Chuan Wu
MoE
AI4CE
32
8
0
17 Nov 2023
OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning
Fei Yu
Anningzhe Gao
Benyou Wang
OffRL
LRM
72
52
0
16 Nov 2023
Striped Attention: Faster Ring Attention for Causal Transformers
William Brandon
Aniruddha Nrusimha
Kevin Qian
Zack Ankner
Tian Jin
Zhiye Song
Jonathan Ragan-Kelley
63
38
0
15 Nov 2023
REST: Retrieval-Based Speculative Decoding
Zhenyu He
Zexuan Zhong
Tianle Cai
Jason D. Lee
Di He
RALM
95
91
0
14 Nov 2023
Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units
Jake Ryland Williams
Haoran Zhao
124
0
0
13 Nov 2023
Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang
Dawei Song
Zheyu Ye
Yan Gao
ELM
74
21
0
13 Nov 2023
To Transformers and Beyond: Large Language Models for the Genome
Micaela Elisa Consens
Cameron Dufault
Michael Wainberg
Duncan Forster
Mehran Karimzadeh
Hani Goodarzi
Fabian J. Theis
Alan Moses
Bo Wang
LM&MA
MedIm
68
32
0
13 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Jiahao Li
Hao Tan
Kai Zhang
Zexiang Xu
Fujun Luan
Yinghao Xu
Yicong Hong
Kalyan Sunkavalli
Greg Shakhnarovich
Sai Bi
131
275
0
10 Nov 2023
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences
Yuanhe Tian
Ruyi Gan
Yan Song
Jiaxing Zhang
Yongdong Zhang
AI4MH
AI4CE
LM&MA
129
41
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
100
30
0
10 Nov 2023
CFBenchmark: Chinese Financial Assistant Benchmark for Large Language Model
Yang Lei
Jiangtong Li
Dawei Cheng
Zhijun Ding
Changjun Jiang
45
11
0
10 Nov 2023
Long-Horizon Dialogue Understanding for Role Identification in the Game of Avalon with Large Language Models
Simon Stepputtis
Joseph Campbell
Yaqi Xie
Zhengyang Qi
W. Zhang
Ruiyi Wang
Sanketh Rangreji
Michael Lewis
Katia Sycara
LLMAG
71
8
0
09 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
85
13
0
09 Nov 2023
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
Johannes Hagemann
Samuel Weinbach
Konstantin Dobler
Maximilian Schall
Gerard de Melo
LRM
95
8
0
09 Nov 2023
High-Performance Transformers for Table Structure Recognition Need Early Convolutions
Sheng-Hsuan Peng
Seongmin Lee
Xiaojing Wang
Rajarajeswari Balasubramaniyan
Duen Horng Chau
ViT
LMTD
46
3
0
09 Nov 2023
GPU-Accelerated WFST Beam Search Decoder for CTC-based Speech Recognition
Daniel Galvez
Tim Kaldewey
54
1
0
08 Nov 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das
Mingjie Sun
Liqun Ma
Zhiqiang Shen
VLM
79
18
0
08 Nov 2023
LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models
Jianxin Yang
43
6
0
08 Nov 2023
Hierarchically Gated Recurrent Neural Network for Sequence Modeling
Zhen Qin
Aaron Courville
Yiran Zhong
90
80
0
08 Nov 2023
DACBERT: Leveraging Dependency Agreement for Cost-Efficient Bert Pretraining
Martin Kuo
Jianyi Zhang
Yiran Chen
52
2
0
08 Nov 2023
Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers
P. D. Haan
Taco S. Cohen
Johann Brehmer
59
10
0
08 Nov 2023
LooGLE: Can Long-Context Language Models Understand Long Contexts?
Jiaqi Li
Mengmeng Wang
Zilong Zheng
Muhan Zhang
ELM
RALM
93
134
0
08 Nov 2023
Prompt Cache: Modular Attention Reuse for Low-Latency Inference
In Gim
Guojun Chen
Seung-seob Lee
Nikhil Sarda
Anurag Khandelwal
Lin Zhong
121
88
0
07 Nov 2023
Practical Performance Guarantees for Pipelined DNN Inference
Aaron Archer
Matthew Fahrbach
Kuikui Liu
Prakash Prabhu
52
0
0
07 Nov 2023
Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models
Longteng Zhang
Xiang Liu
Zeyu Li
Xinglin Pan
Peijie Dong
...
Rui Guo
Xin Wang
Qiong Luo
Shaoshuai Shi
Xiaowen Chu
84
8
0
07 Nov 2023
A Foundation Model for Music Informatics
Minz Won
Yun-Ning Hung
Duc Le
110
23
0
06 Nov 2023
Ziya2: Data-centric Learning is All LLMs Need
Ruyi Gan
Ziwei Wu
Renliang Sun
Junyu Lu
Xiaojun Wu
...
Ping Yang
Qi Yang
Hao Wang
Jiaxing Zhang
Yan Song
VLM
ALM
99
19
0
06 Nov 2023
Instructed Language Models with Retrievers Are Powerful Entity Linkers
Zilin Xiao
Ming Gong
Jie Wu
Xingyao Zhang
Linjun Shou
Jian Pei
Daxin Jiang
LRM
84
12
0
06 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
72
2
0
06 Nov 2023
PhoGPT: Generative Pre-training for Vietnamese
Dat Quoc Nguyen
L. T. Nguyen
Chi Tran
Dung Ngoc Nguyen
D.Q. Phung
Hung Bui
59
9
0
06 Nov 2023
Ultra-Long Sequence Distributed Transformer
Xiao Wang
Isaac Lyngaas
A. Tsaris
Peng Chen
Sajal Dash
Mayanka Chandra Shekar
Tao Luo
Hong-Jun Yoon
Mohamed Wahib
John P. Gounley
124
4
0
04 Nov 2023
ForecastPFN: Synthetically-Trained Zero-Shot Forecasting
Samuel Dooley
Gurnoor Singh Khurana
Chirag Mohapatra
Siddartha Naidu
Colin White
AI4TS
146
66
0
03 Nov 2023
FlashDecoding++: Faster Large Language Model Inference on GPUs
Ke Hong
Guohao Dai
Jiaming Xu
Qiuli Mao
Xiuhong Li
Jun Liu
Kangdi Chen
Yuhan Dong
Yu Wang
88
77
0
02 Nov 2023