ResearchTrend.AI
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

17 July 2023
Tri Dao
    LRM

Papers citing "FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning"

50 / 329 papers shown
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
Akshara Prabhakar
Ziqiang Liu
Weiran Yao
Jianguo Zhang
Ming Zhu
...
Juan Carlos Niebles
Shelby Heinecke
Han Wang
Siyang Song
Caiming Xiong
VGen
158
11
0
04 Apr 2025
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Zhijun Wang
Jiahuan Li
Hao Zhou
Rongxiang Weng
Jiadong Wang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Shujian Huang
LRM
106
3
0
02 Apr 2025
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas
Akash Das
Shivam Gupta
Venkataramana Runkana
OffRL
122
1
0
02 Apr 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
Jiadong Wang
Tao Dai
Shu-Tao Xia
Luca Benini
160
5
0
30 Mar 2025
Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
Raman Dutt
Harleen Hanspal
Guoxuan Xia
Petru-Daniel Tudosiu
Alexander Black
Yongxin Yang
Jingyu Sun
Sarah Parisot
MoE
102
0
0
28 Mar 2025
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
Yijiong Yu
LRM, AIMat
177
1
0
26 Mar 2025
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
Yu-Hsi Chen
118
0
0
21 Mar 2025
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism
Venmugil Elango
106
0
0
20 Mar 2025
From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment
Jia-Nan Li
Jian Guan
Songhao Wu
Wei Wu
Rui Yan
169
3
0
19 Mar 2025
MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models
Jia-Nan Li
Lu Yu
Daixin Wang
Qing Cui
Jun Zhou
Yanfang Ye
Chuxu Zhang
122
0
0
19 Mar 2025
Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis
Imanol G. Estepa
Jesús M. Rodríguez-de-Vera
Ignacio Sarasúa
Bhalaji Nagarajan
Petia Radeva
213
0
0
19 Mar 2025
ML-Triton, A Multi-Level Compilation and Language Extension to Triton GPU Programming
Dewei Wang
Wei Zhu
Liyang Ling
Ettore Tiotto
Quintin Wang
Whitney Tsang
Julian Opperman
Jacky Deng
72
0
0
19 Mar 2025
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
M. Beck
Korbinian Poppel
Phillip Lippe
Sepp Hochreiter
164
4
0
18 Mar 2025
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models
Yuxiang Lai
Shitian Zhao
Ming Li
Jike Zhong
Xiaofeng Yang
OffRL, LRM, LM&MA, VLM
186
31
0
18 Mar 2025
AccelGen: Heterogeneous SLO-Guaranteed High-Throughput LLM Inference Serving for Diverse Applications
Haiying Shen
Tanmoy Sen
86
0
0
17 Mar 2025
SAM2 for Image and Video Segmentation: A Comprehensive Survey
Zhang Jiaxing
Tang Hao
VLM
110
0
0
17 Mar 2025
CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences
Ziran Qin
Yuchen Cao
Mingbao Lin
Wen Hu
Shixuan Fan
Ke Cheng
Weiyao Lin
Jianguo Li
127
5
0
16 Mar 2025
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Teng Wang
Zhangyi Jiang
Zhenqi He
Wenhan Yang
Yanan Zheng
Zeyu Li
Zifan He
Shenyang Tong
Hailei Gong
LRM
169
2
0
16 Mar 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
78
0
0
13 Mar 2025
Accurate INT8 Training Through Dynamic Block-Level Fallback
Pengle Zhang
Jia Wei
Jintao Zhang
Jun-Jie Zhu
Jianfei Chen
MQ
173
9
0
11 Mar 2025
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
Emily Xiao
Chin-Jou Li
Yilin Zhang
Graham Neubig
Amanda Bertsch
BDL
116
1
0
11 Mar 2025
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun
Disen Lan
Tong Zhu
Xiaoye Qu
Yu Cheng
MoE
236
4
0
07 Mar 2025
Development and Enhancement of Text-to-Image Diffusion Models
Rajdeep Roshan Sahu
VLM
160
44
0
07 Mar 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
528
3
0
07 Mar 2025
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali
Nikolos Gurney
David Pynadath
AI4TS
191
14
0
05 Mar 2025
Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer
Yujiao Yang
Jing Lian
Linhui Li
MoE
139
0
0
04 Mar 2025
LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs
Jianghao Chen
Junhong Wu
Yangyifan Xu
J.N. Zhang
105
1
0
04 Mar 2025
Tera-MIND: Tera-scale mouse brain simulation via spatial mRNA-guided diffusion
Jiqing Wu
Ingrid Berg
Yawei Li
Ender Konukoglu
V. Koelzer
AI4CE
134
0
0
03 Mar 2025
Dialogue Without Limits: Constant-Sized KV Caches for Extended Responses in LLMs
Ravi Ghadia
Avinash Kumar
Gaurav Jain
Prashant J. Nair
Poulami Das
87
2
0
02 Mar 2025
Smoothing Grounding and Reasoning for MLLM-Powered GUI Agents with Query-Oriented Pivot Tasks
Zongru Wu
Pengzhou Cheng
Zheng Wu
Tianjie Ju
Zhuosheng Zhang
Gongshen Liu
LRM
93
4
0
01 Mar 2025
Sentence-level Reward Model can Generalize Better for Aligning LLM from Human Preference
Wenjie Qiu
Yi-Chen Li
Xuqin Zhang
Tianyi Zhang
Yiming Zhang
Zongzhang Zhang
Yang Yu
ALM
109
1
0
01 Mar 2025
Training LLMs with MXFP4
Albert Tseng
Tao Yu
Youngsuk Park
93
5
0
27 Feb 2025
END: Early Noise Dropping for Efficient and Effective Context Denoising
Hongye Jin
Pei Chen
Jingfeng Yang
Zhaoxiang Wang
Meng Jiang
...
Wei Wei
Zheng Li
Tianyi Liu
Huasheng Li
Bing Yin
435
1
0
26 Feb 2025
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura
Takuya Akiba
Kazuki Fujii
Yusuke Oda
Rio Yokota
Jun Suzuki
MoMe, MoE
134
2
0
26 Feb 2025
NeoBERT: A Next-Generation BERT
Lola Le Breton
Quentin Fournier
Mariam El Mezouar
John X. Morris
Sarath Chandar
AI4TS
137
1
0
26 Feb 2025
On Synthetic Data Strategies for Domain-Specific Generative Retrieval
Haoyang Wen
Jiang Guo
Yi Zhang
Jiarong Jiang
Ziyi Wang
SyDa
125
1
0
25 Feb 2025
Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems
Maksim Zhdanov
Max Welling
Jan-Willem van de Meent
AI4CE
125
3
0
24 Feb 2025
Training a Generally Curious Agent
Fahim Tajwar
Yiding Jiang
Abitha Thankaraj
Sumaita Sadia Rahman
J. Zico Kolter
Jeff Schneider
Ruslan Salakhutdinov
237
3
0
24 Feb 2025
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings
Layba Fiaz
Munief Hassan Tahir
Sana Shams
Sarmad Hussain
95
0
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
235
49
0
24 Feb 2025
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving
Zeyu Yang
Nan Song
Wei Li
Xiatian Zhu
Lefei Zhang
Philip H. S. Torr
162
4
0
24 Feb 2025
AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms
Feiyang Chen
Yu Cheng
Lei Wang
Yuqing Xia
Ziming Miao
...
Fan Yang
Jinbao Xue
Zhi Yang
M. Yang
H. Chen
127
1
0
24 Feb 2025
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
Jiaxi Li
Xingxing Zhang
Xun Wang
Xiaolong Huang
Li Dong
Liang Wang
Si-Qing Chen
Wei Lu
Furu Wei
SyDa
476
1
0
23 Feb 2025
SQLong: Enhanced NL2SQL for Longer Contexts with LLMs
Dai Quoc Nguyen
Cong Duy Vu Hoang
Duy Vu
Gioacchino Tangari
Thanh Tien Vu
Don Dharmasiri
Yuan-Fang Li
Long Duong
112
0
0
23 Feb 2025
Fine-Tuning Qwen 2.5 3B for Realistic Movie Dialogue Generation
Kartik Gupta
VGen
89
1
0
22 Feb 2025
Neural Attention Search
Difan Deng
Marius Lindauer
146
0
0
21 Feb 2025
Self-Taught Agentic Long Context Understanding
Yufan Zhuang
Xiaodong Yu
Jialian Wu
Xingwu Sun
Zihan Wang
Jiang Liu
Yusheng Su
Jingbo Shang
Zicheng Liu
Emad Barsoum
LRM
87
0
0
21 Feb 2025
Surface Vision Mamba: Leveraging Bidirectional State Space Model for Efficient Spherical Manifold Representation
Rongzhao He
Weihao Zheng
Leilei Zhao
Ying Wang
Dalin Zhu
Dan Wu
Bin Hu
Mamba
196
0
0
21 Feb 2025
Simpler Fast Vision Transformers with a Jumbo CLS Token
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
203
0
0
20 Feb 2025
Multilingual Language Model Pretraining using Machine-translated Data
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
David Ifeoluwa Adelani
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
130
5
0
20 Feb 2025