FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
    VLM

Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"

50 / 1,427 papers shown
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xinyu Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
45
1
0
28 Mar 2025
InternVL-X: Advancing and Accelerating InternVL Series with Efficient Visual Token Compression
Dongchen Lu
Yuyao Sun
Zilu Zhang
Leping Huang
Jianliang Zeng
Mao Shu
Huo Cao
39
0
0
27 Mar 2025
A Multi-Modal Knowledge-Enhanced Framework for Vessel Trajectory Prediction
Haomin Yu
Tianyi Li
Kristian Torp
Christian S. Jensen
46
0
0
27 Mar 2025
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap
Tong Nie
Jian Sun
Wei Ma
72
1
0
27 Mar 2025
Inductive Link Prediction on N-ary Relational Facts via Semantic Hypergraph Reasoning
Gongzhu Yin
H. Zhang
Yuchen Yang
Y. Luo
LRM
85
0
0
26 Mar 2025
Named Entity Recognition in Context
Colin Brisson
Ayoub Kahfy
Marc Bui
Frédéric Constant
54
0
0
26 Mar 2025
UniEDU: A Unified Language and Vision Assistant for Education Applications
Zhendong Chu
Jian Xie
Shen Wang
Zhilin Wang
Qingsong Wen
AI4Ed
115
0
0
26 Mar 2025
GIViC: Generative Implicit Video Compression
Ge Gao
Siyue Teng
Tianhao Peng
Fan Zhang
David Bull
DiffM
VGen
43
0
0
25 Mar 2025
Bigger But Not Better: Small Neural Language Models Outperform Large Language Models in Detection of Thought Disorder
Changye Li
Weizhe Xu
Serguei V. S. Pakhomov
Ellen Bradley
Dror Ben-Zeev
T. Cohen
39
0
0
25 Mar 2025
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies
Niccolò Cavagnero
Alexander Hermans
Narges Norouzi
Giuseppe Averta
Bastian Leibe
Gijs Dubbelman
Daan de Geus
ViT
VLM
67
1
0
24 Mar 2025
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Minsu Kim
Seongmin Hong
RyeoWook Ko
S. Choi
Hunjong Lee
Junsoo Kim
J. Kim
Jongse Park
57
0
0
24 Mar 2025
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
Zhanda Zhu
Christina Giannoula
Muralidhar Andoorveedu
Qidong Su
Karttikeya Mangalam
Bojian Zheng
Gennady Pekhimenko
VLM
MoE
54
0
0
24 Mar 2025
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
...
Jinghua Yan
Y. Bai
P. Sadayappan
Xia Hu
Bo Yuan
VLM
61
0
0
24 Mar 2025
WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training
Zhilin Wang
Anna Cai
Xinfeng Xie
Zaifeng Pan
Yue Guan
...
Shikai Li
Jianyu Huang
Chris Cai
Yuchen Hao
Yufei Ding
39
2
0
23 Mar 2025
OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
Vignesh Prabhakar
Md Amirul Islam
Adam Atanas
Yixuan Wang
J. N. Han
...
Rucha Apte
Robert Clark
Kang Xu
Zihan Wang
Kai Liu
LRM
85
1
0
22 Mar 2025
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
Yu-Hsi Chen
41
0
0
21 Mar 2025
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens
Shuqi Lu
Haowei Lin
Lin Yao
Zhifeng Gao
Xiaohong Ji
W. Elwasif
Linfeng Zhang
Guolin Ke
48
0
0
20 Mar 2025
iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation
Hanxiao Wang
Biao Zhang
Weize Quan
Dong-ming Yan
Peter Wonka
51
0
0
20 Mar 2025
PSA-MIL: A Probabilistic Spatial Attention-Based Multiple Instance Learning for Whole Slide Image Classification
Sharon Peled
Y. Maruvka
Moti Freiman
44
0
0
20 Mar 2025
FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
Ibrahim Al Azher
Miftahul Jannat Mokarrama
Zhishuai Guo
Sagnik Ray Choudhury
Hamed Alhoori
LLMAG
53
1
0
20 Mar 2025
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism
Venmugil Elango
50
0
0
20 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM
MDE
68
0
0
19 Mar 2025
Benchmarking Large Language Models for Handwritten Text Recognition
Giorgia Crosilla
Lukas Klic
Giovanni Colavizza
40
0
0
19 Mar 2025
EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?
Xinyan Chen
Jiaxin Ge
Hongming Dai
Qiang Zhou
Qiuxuan Feng
Jingtong Hu
Yishuo Wang
Jiaming Liu
Shanghang Zhang
LM&Ro
67
0
0
19 Mar 2025
Prada: Black-Box LLM Adaptation with Private Data on Resource-Constrained Devices
Zhilin Wang
Yexiao He
Zheyu Shen
Yu Li
Guoheng Sun
Myungjin Lee
Ang Li
48
0
0
19 Mar 2025
Identifying and Mitigating Position Bias of Multi-image Vision-Language Models
Xinyu Tian
Shu Zou
Zhaoyuan Yang
Jing Zhang
63
0
0
18 Mar 2025
Bolt3D: Generating 3D Scenes in Seconds
Stanislaw Szymanowicz
Jason Y. Zhang
P. Srinivasan
Ruiqi Gao
Arthur Brussee
Aleksander Holynski
Ricardo Martín Brualla
Jonathan T. Barron
Philipp Henzler
98
4
0
18 Mar 2025
Growing a Twig to Accelerate Large Vision-Language Models
Zhenwei Shao
Mingyang Wang
Zhou Yu
Wenwen Pan
Yan Yang
Tao Wei
H. Zhang
Ning Mao
Wei Chen
Jun Yu
VLM
64
1
0
18 Mar 2025
SplatVoxel: History-Aware Novel View Streaming without Temporal Training
Yiming Wang
Lucy Chai
Xuan Luo
Michael Niemeyer
Manuel Lagunas
Stephen Lombardi
Siyu Tang
Tiancheng Sun
3DGS
58
0
0
18 Mar 2025
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
M. Beck
Korbinian Poppel
Phillip Lippe
Sepp Hochreiter
63
1
0
18 Mar 2025
Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency
Jiangxuan Long
Zhao-quan Song
Chiwun Yang
AI4TS
162
0
0
18 Mar 2025
State Space Model Meets Transformer: A New Paradigm for 3D Object Detection
Chuxin Wang
Wenfei Yang
Xiang Liu
Tianzhu Zhang
59
0
0
18 Mar 2025
Fake Runs, Real Fixes -- Analyzing xPU Performance Through Simulation
Ioannis Zarkadas
Amanda Tomlinson
Asaf Cidon
Baris Kasikci
Ofir Weisse
52
0
0
18 Mar 2025
Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference
Hao Yin
Guangzong Si
Zilei Wang
51
0
0
17 Mar 2025
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling
Yingyue Li
Bencheng Liao
Wenyu Liu
Xinggang Wang
Mamba
61
0
0
17 Mar 2025
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Teng Wang
Zhangyi Jiang
Zhenqi He
Wenhan Yang
Yanan Zheng
Zeyu Li
Zifan He
Shenyang Tong
Hailei Gong
LRM
90
1
0
16 Mar 2025
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Nir Ailon
Akhiad Bercovich
Omri Weinstein
57
0
0
15 Mar 2025
Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training
Zhenxin Li
Shihao Wang
Shiyi Lan
Zhiding Yu
Zuxuan Wu
Jose M. Alvarez
54
2
0
15 Mar 2025
Similarity-Aware Token Pruning: Your VLM but Faster
Ahmadreza Jeddi
Negin Baghbanzadeh
Elham Dolatabadi
Babak Taati
3DV
VLM
59
1
0
14 Mar 2025
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen
Guoqiang Gong
Tao He
Yifeng Zhang
Pengzhang Liu
Sicheng Zhao
Guiguang Ding
VLM
69
0
0
14 Mar 2025
TransiT: Transient Transformer for Non-line-of-sight Videography
Ruiqian Li
Siyuan Shen
Suan Xia
Z. Wang
Xingyue Peng
Chengxuan Song
Yingsheng Zhu
Tao Wu
Shiying Li
Jingyi Yu
55
0
0
14 Mar 2025
TigerLLM -- A Family of Bangla Large Language Models
Nishat Raihan
Marcos Zampieri
48
0
0
14 Mar 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
50
0
0
13 Mar 2025
Take Off the Training Wheels! Progressive In-Context Learning for Effective Alignment
Zhenyu Liu
Dongfang Li
Xinshuo Hu
X. Zhao
Yibin Chen
Baotian Hu
Min-Ling Zhang
49
1
0
13 Mar 2025
Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling
Shuqi Lu
Xiaohong Ji
Bohang Zhang
Lin Yao
Siyuan Liu
Zhifeng Gao
Linfeng Zhang
Guolin Ke
AI4CE
46
1
0
13 Mar 2025
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
Zexuan Yan
Yue Ma
Chang Zou
Wenteng Chen
Qifeng Chen
Linfeng Zhang
63
0
0
13 Mar 2025
ZSMerge: Zero-Shot KV Cache Compression for Memory-Efficient Long-Context LLMs
Xin Liu
Pei Liu
Guoming Tang
MoMe
54
0
0
13 Mar 2025
Speedy MASt3R
Jingxing Li
Yongjae Lee
Abhay Kumar Yadav
Cheng-Fang Peng
Rama Chellappa
Deliang Fan
3DGS
61
0
0
13 Mar 2025
Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li
Jinyue Yang
Guoqi Li
Huan Wang
55
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
84
8
0
13 Mar 2025