2205.14135
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
50 / 1,436 papers shown
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
95
9
0
27 Nov 2023
Cerbero-7B: A Leap Forward in Language-Specific LLMs Through Enhanced Chat Corpus Generation and Evaluation
Federico A. Galatolo
M. G. Cimino
43
5
0
27 Nov 2023
SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling
Habib Hajimolahoseini
Omar Mohamed Awad
Walid Ahmed
Austin Wen
Saina Asani
...
Farnoosh Javadi
Mehdi Ahmadi
Foozhan Ataiefard
Kangling Liu
Yang Liu
29
2
0
25 Nov 2023
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
Raghav Addanki
Chenyang Li
Zhao Song
Chiwun Yang
50
3
0
24 Nov 2023
PrivateLoRA For Efficient Privacy Preserving LLM
Yiming Wang
Yu Lin
Xiaodong Zeng
Guannan Zhang
66
11
0
23 Nov 2023
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
LLMAG
KELM
44
56
0
21 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
43
4
0
21 Nov 2023
Applications of Large Scale Foundation Models for Autonomous Driving
Yu Huang
Yue Chen
Zhu Li
ELM
AI4CE
LRM
ALM
LM&Ro
61
15
0
20 Nov 2023
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Youhe Jiang
Ran Yan
Xiaozhe Yao
Yang Zhou
Beidi Chen
Binhang Yuan
SyDa
30
10
0
20 Nov 2023
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
Di Chang
Yichun Shi
Quankai Gao
Jessica Fu
Hongyi Xu
Guoxian Song
Qing Yan
Yizhe Zhu
Xiao Yang
Mohammad Soleymani
DiffM
VGen
22
50
0
18 Nov 2023
A Language Agent for Autonomous Driving
Jiageng Mao
Junjie Ye
Yuxi Qian
Marco Pavone
Yue Wang
LM&Ro
LRM
23
91
0
17 Nov 2023
DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
Chenyu Jiang
Zhen Jia
Shuai Zheng
Yida Wang
Chuan Wu
MoE
AI4CE
25
8
0
17 Nov 2023
OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning
Fei Yu
Anningzhe Gao
Benyou Wang
OffRL
LRM
17
43
0
16 Nov 2023
Striped Attention: Faster Ring Attention for Causal Transformers
William Brandon
Aniruddha Nrusimha
Kevin Qian
Zack Ankner
Tian Jin
Zhiye Song
Jonathan Ragan-Kelley
24
36
0
15 Nov 2023
Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering
Junqing He
Kunhao Pan
Xiaoqun Dong
Zhuoyang Song
LiuYiBo
...
Hao Wang
Qianguosun
Enming Zhang
Zejian Xie
Jiaxing Zhang
KELM
RALM
36
15
0
15 Nov 2023
REST: Retrieval-Based Speculative Decoding
Zhenyu He
Zexuan Zhong
Tianle Cai
Jason D. Lee
Di He
RALM
28
80
0
14 Nov 2023
Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units
Jake Ryland Williams
Haoran Zhao
21
0
0
13 Nov 2023
Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang
Dawei Song
Zheyu Ye
Yan Gao
ELM
38
20
0
13 Nov 2023
To Transformers and Beyond: Large Language Models for the Genome
Micaela Elisa Consens
Cameron Dufault
Michael Wainberg
Duncan Forster
Mehran Karimzadeh
Hani Goodarzi
Fabian J. Theis
Alan Moses
Bo Wang
LM&MA
MedIm
26
28
0
13 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Jiahao Li
Hao Tan
Kai Zhang
Zexiang Xu
Fujun Luan
Yinghao Xu
Yicong Hong
Kalyan Sunkavalli
Greg Shakhnarovich
Sai Bi
59
254
0
10 Nov 2023
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences
Yuanhe Tian
Ruyi Gan
Yan Song
Jiaxing Zhang
Yongdong Zhang
AI4MH
AI4CE
LM&MA
27
31
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
41
29
0
10 Nov 2023
CFBenchmark: Chinese Financial Assistant Benchmark for Large Language Model
Yang Lei
Jiangtong Li
Dawei Cheng
Zhijun Ding
Changjun Jiang
26
10
0
10 Nov 2023
Long-Horizon Dialogue Understanding for Role Identification in the Game of Avalon with Large Language Models
Simon Stepputtis
Joseph Campbell
Yaqi Xie
Zhengyang Qi
W. Zhang
Ruiyi Wang
Sanketh Rangreji
Michael Lewis
Katia P. Sycara
LLMAG
32
8
0
09 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
48
10
0
09 Nov 2023
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
Johannes Hagemann
Samuel Weinbach
Konstantin Dobler
Maximilian Schall
Gerard de Melo
LRM
42
6
0
09 Nov 2023
High-Performance Transformers for Table Structure Recognition Need Early Convolutions
Sheng-Hsuan Peng
Seongmin Lee
Xiaojing Wang
Rajarajeswari Balasubramaniyan
Duen Horng Chau
ViT
LMTD
24
3
0
09 Nov 2023
GPU-Accelerated WFST Beam Search Decoder for CTC-based Speech Recognition
Daniel Galvez
Tim Kaldewey
23
1
0
08 Nov 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das
Mingjie Sun
Liqun Ma
Zhiqiang Shen
VLM
20
13
0
08 Nov 2023
LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models
Jianxin Yang
22
6
0
08 Nov 2023
Hierarchically Gated Recurrent Neural Network for Sequence Modeling
Zhen Qin
Aaron Courville
Yiran Zhong
36
74
0
08 Nov 2023
DACBERT: Leveraging Dependency Agreement for Cost-Efficient Bert Pretraining
Martin Kuo
Jianyi Zhang
Yiran Chen
27
2
0
08 Nov 2023
Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers
P. D. Haan
Taco S. Cohen
Johann Brehmer
35
9
0
08 Nov 2023
LooGLE: Can Long-Context Language Models Understand Long Contexts?
Jiaqi Li
Mengmeng Wang
Zilong Zheng
Muhan Zhang
ELM
RALM
40
107
0
08 Nov 2023
Prompt Cache: Modular Attention Reuse for Low-Latency Inference
In Gim
Guojun Chen
Seung-seob Lee
Nikhil Sarda
Anurag Khandelwal
Lin Zhong
42
77
0
07 Nov 2023
Practical Performance Guarantees for Pipelined DNN Inference
Aaron Archer
Matthew Fahrbach
Kuikui Liu
Prakash Prabhu
29
0
0
07 Nov 2023
Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models
Longteng Zhang
Xiang Liu
Zeyu Li
Xinglin Pan
Peijie Dong
...
Rui Guo
Xin Wang
Qiong Luo
S. Shi
Xiaowen Chu
49
7
0
07 Nov 2023
A Foundation Model for Music Informatics
Minz Won
Yun-Ning Hung
Duc Le
66
20
0
06 Nov 2023
Ziya2: Data-centric Learning is All LLMs Need
Ruyi Gan
Ziwei Wu
Renliang Sun
Junyu Lu
Xiaojun Wu
...
Ping Yang
Qi Yang
Hao Wang
Jiaxing Zhang
Yan Song
VLM
ALM
23
17
0
06 Nov 2023
Instructed Language Models with Retrievers Are Powerful Entity Linkers
Zilin Xiao
Ming Gong
Jie Wu
Xingyao Zhang
Linjun Shou
Jian Pei
Daxin Jiang
LRM
32
12
0
06 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
33
2
0
06 Nov 2023
PhoGPT: Generative Pre-training for Vietnamese
Dat Quoc Nguyen
L. T. Nguyen
Chi Tran
Dung Ngoc Nguyen
D.Q. Phung
Hung Bui
28
9
0
06 Nov 2023
Ultra-Long Sequence Distributed Transformer
Xiao Wang
Isaac Lyngaas
A. Tsaris
Peng Chen
Sajal Dash
Mayanka Chandra Shekar
Tao Luo
Hong-Jun Yoon
M. Wahib
John P. Gounley
35
4
0
04 Nov 2023
ForecastPFN: Synthetically-Trained Zero-Shot Forecasting
Samuel Dooley
Gurnoor Singh Khurana
Chirag Mohapatra
Siddartha Naidu
Colin White
AI4TS
89
60
0
03 Nov 2023
FlashDecoding++: Faster Large Language Model Inference on GPUs
Ke Hong
Guohao Dai
Jiaming Xu
Qiuli Mao
Xiuhong Li
Jun Liu
Kangdi Chen
Yuhan Dong
Yu Wang
21
70
0
02 Nov 2023
Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
22
4
0
01 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
27
52
0
01 Nov 2023
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Ruihang Lai
Junru Shao
Siyuan Feng
Steven Lyubomirsky
Bohan Hou
...
Sunghyun Park
Prakalp Srivastava
Jared Roesch
T. Mowry
Tianqi Chen
47
9
0
01 Nov 2023
ChipNeMo: Domain-Adapted LLMs for Chip Design
Mingjie Liu
Teodor-Dumitru Ene
Robert M. Kirby
Chris Cheng
N. Pinckney
...
Pratik P Suthar
Varun Tej
Walker J. Turner
Kaizhe Xu
Haoxin Ren
53
146
0
31 Oct 2023
HyPE: Attention with Hyperbolic Biases for Relative Positional Encoding
Giorgio Angelotti
16
0
0
30 Oct 2023