FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
    VLM
arXiv: 2205.14135

Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"

50 / 1,443 papers shown
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Jiaming Tang
Yilong Zhao
Kan Zhu
Guangxuan Xiao
Baris Kasikci
Song Han
46
78
0
16 Jun 2024
UniZero: Generalized and Efficient Planning with Scalable Latent World Models
Yuan Pu
Yazhe Niu
Jiyuan Ren
Zhenjie Yang
Hongsheng Li
Yu Liu
OffRL
64
1
0
15 Jun 2024
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
Yuchen Ren
Zhiyuan Chen
Lifeng Qiao
Hongtai Jing
Yuchen Cai
...
Siqi Sun
Hongliang Yan
Dong Yuan
Wanli Ouyang
Xihui Liu
52
9
0
14 Jun 2024
Diffusion Synthesizer for Efficient Multilingual Speech to Speech Translation
Nameer Hirschkind
Xiao Yu
Mahesh Kumar Nandwana
Joseph Liu
Eloi DuBois
...
Colin Sinclair
Kyle Spence
Charles Shang
Zoë Abrams
Morgan McGuire
47
0
0
14 Jun 2024
Towards Scalable and Versatile Weight Space Learning
Konstantin Schurholt
Michael W. Mahoney
Damian Borth
55
16
0
14 Jun 2024
GEB-1.3B: Open Lightweight Large Language Model
Jie Wu
Yufeng Zhu
Lei Shen
Xuqing Lu
ALM
37
0
0
14 Jun 2024
Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark
Gaochang Wu
Yapeng Zhang
Lan Deng
Jingxin Zhang
Tianyou Chai
43
6
0
13 Jun 2024
Optimal Kernel Orchestration for Tensor Programs with Korch
Muyan Hu
Ashwin Venkatram
Shreyashri Biswas
Balamurugan Marimuthu
Bohan Hou
Gabriele Oliaro
Haojie Wang
Liyan Zheng
Xupeng Miao
Jidong Zhai
266
4
0
13 Jun 2024
Multimodal Table Understanding
Mingyu Zheng
Xinwei Feng
Q. Si
Qiaoqiao She
Zheng Lin
Wenbin Jiang
Weiping Wang
LMTD
VLM
62
14
0
12 Jun 2024
Sustainable self-supervised learning for speech representations
Luis Lugo
Valentin Vielzeuf
55
2
0
11 Jun 2024
QuickLLaMA: Query-aware Inference Acceleration for Large Language Models
Jingyao Li
Han Shi
Xin Jiang
Zhenguo Li
Hong Xu
Jiaya Jia
LRM
41
2
0
11 Jun 2024
Markov Constraint as Large Language Model Surrogate
Alexandre Bonlarron
Jean-Charles Régin
39
1
0
11 Jun 2024
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You
Yichao Fu
Zheng Wang
Amir Yazdanbakhsh
Yingyan Celine Lin
71
2
0
11 Jun 2024
Needle In A Multimodal Haystack
Weiyun Wang
Shuibo Zhang
Yiming Ren
Yuchen Duan
Tiantong Li
...
Ping Luo
Yu Qiao
Jifeng Dai
Wenqi Shao
Wenhai Wang
VLM
61
17
0
11 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
79
57
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
70
239
0
10 Jun 2024
Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
Martin Courtois
Malte Ostendorff
Leonhard Hennig
Georg Rehm
49
2
0
10 Jun 2024
DualAD: Disentangling the Dynamic and Static World for End-to-End Driving
Simon Doll
Niklas Hanselmann
Lukas Schneider
Richard Schulz
Marius Cordts
Markus Enzweiler
Hendrik P. A. Lensch
43
6
0
10 Jun 2024
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Haoran You
Yipin Guo
Yichao Fu
Wei Zhou
Huihong Shi
Xiaofan Zhang
Souvik Kundu
Amir Yazdanbakhsh
Y. Lin
KELM
59
8
0
10 Jun 2024
SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models
Hengyu Zhang
RALM
52
2
0
09 Jun 2024
MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training
Bo Chen
Zhilei Bei
Xingyi Cheng
Pan Li
Jie Tang
Le Song
45
4
0
08 Jun 2024
Beyond Efficiency: Scaling AI Sustainably
Carole-Jean Wu
Bilge Acun
Ramya Raghavendra
Kim Hazelwood
GNN
54
15
0
08 Jun 2024
Enabling Efficient Batch Serving for LMaaS via Generation Length Prediction
Ke Cheng
Wen Hu
Zhi Wang
Peng Du
Jianguo Li
Sheng Zhang
59
10
0
07 Jun 2024
LLM-based speaker diarization correction: A generalizable approach
Georgios Efstathiadis
Vijay Yadav
Anzar Abbas
62
3
0
07 Jun 2024
Proofread: Fixes All Errors with One Tap
Renjie Liu
Yanxiang Zhang
Yun Zhu
Haicheng Sun
Yuanbo Zhang
Michael Xuelin Huang
Shanqing Cai
Lei Meng
Shumin Zhai
ALM
38
2
0
06 Jun 2024
Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis
Théodor Lemerle
Nicolas Obin
Axel Roebel
47
6
0
06 Jun 2024
Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness
L. Hillebrand
Prabhupad Pradhan
Christian Bauckhage
R. Sifa
26
1
0
06 Jun 2024
BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning
Artem Zholus
Maksim Kuznetsov
Roman Schutski
Rim Shayakhmetov
Daniil Polykovskiy
Sarath Chandar
Alex Zhavoronkov
DiffM
AI4CE
45
6
0
06 Jun 2024
OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
Dujian Ding
Bicheng Xu
L. Lakshmanan
VLM
51
1
0
06 Jun 2024
Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training
Ao Sun
Weilin Zhao
Xu Han
Cheng Yang
Zhiyuan Liu
Chuan Shi
Maosong Sun
36
7
0
05 Jun 2024
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Mona Ahmadian
Frank Guerin
Andrew Gilbert
74
1
0
05 Jun 2024
Training of Physical Neural Networks
Ali Momeni
Babak Rahmani
B. Scellier
Logan G. Wright
Peter L. McMahon
...
Julie Grollier
Andrea J. Liu
D. Psaltis
Andrea Alù
Romain Fleury
PINN
AI4CE
62
10
0
05 Jun 2024
Llumnix: Dynamic Scheduling for Large Language Model Serving
Biao Sun
Ziming Huang
Hanyu Zhao
Wencong Xiao
Xinyi Zhang
Yong Li
Wei Lin
43
47
0
05 Jun 2024
Balancing Performance and Efficiency in Zero-shot Robotic Navigation
Dmytro Kuzmenko
N. Shvai
LM&Ro
52
0
0
05 Jun 2024
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
Alex Jinpeng Wang
Linjie Li
Yiqi Lin
Min Li
Lijuan Wang
Mike Zheng Shou
VLM
56
3
0
04 Jun 2024
Loki: Low-Rank Keys for Efficient Sparse Attention
Prajwal Singhania
Siddharth Singh
Shwai He
Soheil Feizi
A. Bhatele
42
13
0
04 Jun 2024
Scalable MatMul-free Language Modeling
Rui-Jie Zhu
Yu Zhang
Ethan Sifferman
Tyler Sheaves
Yiqiao Wang
Dustin Richmond
P. Zhou
Jason K. Eshraghian
38
17
0
04 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Namgyu Ho
Sangmin Bae
Taehyeon Kim
Hyunjik Jo
Yireun Kim
Tal Schuster
Adam Fisch
James Thorne
Se-Young Yun
60
8
0
04 Jun 2024
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Philip Anastassiou
Jiawei Chen
Jingshu Chen
Yuanzhe Chen
Zhuo Chen
...
Wenjie Zhang
Yanzhe Zhang
Zilin Zhao
Dejian Zhong
Xiaobin Zhuang
65
86
0
04 Jun 2024
Learning to Edit Visual Programs with Self-Supervision
R. K. Jones
Renhao Zhang
Aditya Ganeshan
Daniel E. Ritchie
SSL
44
3
0
04 Jun 2024
Extended Mind Transformers
Phoebe Klett
Thomas Ahle
RALM
29
0
0
04 Jun 2024
A Study of Optimizations for Fine-tuning Large Language Models
Arjun Singh
Nikhil Pandey
Anup Shirgaonkar
Pavan Manoj
Vijay Aski
29
4
0
04 Jun 2024
Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning
Jiahang Cao
Qiang Zhang
Ziqing Wang
Jiaxu Wang
Hao Cheng
Yecheng Shao
Wen Zhao
Gang Han
Yijie Guo
Renjing Xu
Mamba
64
2
0
04 Jun 2024
GRAM: Generative Retrieval Augmented Matching of Data Schemas in the Context of Data Security
Xuanqing Liu
Luyang Kong
Runhui Wang
Patrick Song
Austin Nevins
Henrik Johnson
Nimish Amlathe
Davor Golac
47
3
0
04 Jun 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Yu Wang
MQ
VGen
112
27
0
04 Jun 2024
Sparsity-Accelerated Training for Large Language Models
Da Ma
Lu Chen
Pengyu Wang
Hongshen Xu
Hanqi Li
Liangtai Sun
Su Zhu
Shuai Fan
Kai Yu
LRM
35
0
0
03 Jun 2024
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen
Linhao Zhang
Junyuan Shang
Zhenyu Zhang
Tingwen Liu
Shuohuan Wang
Yu Sun
46
1
0
03 Jun 2024
Achieving Sparse Activation in Small Language Models
Jifeng Song
Kai Huang
Xiangyu Yin
Boyuan Yang
Wei Gao
40
4
0
03 Jun 2024
LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models
Liang Zhao
Tianwen Wei
Liang Zeng
Cheng Cheng
Liu Yang
...
Yimeng Gan
Rui Hu
Shuicheng Yan
Han Fang
Yahui Zhou
LLMAG
SyDa
78
10
0
02 Jun 2024
GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models
Zicheng Liu
Jiahui Li
Siyuan Li
Z. Zang
Cheng Tan
Yufei Huang
Yajing Bai
Stan Z. Li
ELM
37
8
0
01 Jun 2024