ResearchTrend.AI
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

17 July 2023
Tri Dao
    LRM

Papers citing "FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning"

50 / 230 papers shown
PSC: Extending Context Window of Large Language Models via Phase Shift Calibration
Wenqiao Zhu
Chao Xu
Lulu Wang
Jun Wu
12
1
0
18 May 2025
GeoMaNO: Geometric Mamba Neural Operator for Partial Differential Equations
Xi Han
Jingwei Zhang
Dimitris Samaras
Fei Hou
Hong Qin
AI4CE
2
0
0
17 May 2025
AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation
X. Zhang
Zetian Ouyang
Linlin Wang
Gerard de Melo
Zhu Cao
Xiaoling Wang
Ya Zhang
Yanfeng Wang
Liang He
LM&MA
ELM
18
0
0
17 May 2025
Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform
Josh Alman
Zhao Song
15
0
0
17 May 2025
Flash Invariant Point Attention
Andrew Liu
Axel Elaldi
Nicholas T Franklin
Nathan Russell
Gurinder S Atwal
Yih-En A Ban
Olivia Viessmann
4
0
0
16 May 2025
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Chenggang Zhao
Chengqi Deng
Chong Ruan
Damai Dai
Huazuo Gao
...
Wenfeng Liang
Ying He
Yishuo Wang
Yuxuan Liu
Y. X. Wei
MoE
41
0
0
14 May 2025
Accelerating Machine Learning Systems via Category Theory: Applications to Spherical Attention for Gene Regulatory Networks
Vincent Abbott
Kotaro Kamiya
Gerard Glowacki
Yu Atsumi
Gioele Zardini
Yoshihiro Maruyama
29
0
0
14 May 2025
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning
Zhaochen Su
Linjie Li
Mingyang Song
Yunzhuo Hao
Zhengyuan Yang
...
Guanjie Chen
Jiawei Gu
Juntao Li
Xiaoye Qu
Yu Cheng
OffRL
LRM
31
0
0
13 May 2025
Fused3S: Fast Sparse Attention on Tensor Cores
Zitong Li
Aparna Chandramowlishwaran
GNN
47
0
0
12 May 2025
I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference
Zibo Gao
Junjie Hu
Feng Guo
Yixin Zhang
Yinglong Han
Siyuan Liu
Haiyang Li
Zhiqiang Lv
31
0
0
10 May 2025
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding
Dawei Huang
Qing Li
Chuan Yan
Zebang Cheng
Jiaming Ji
Xiang Li
Yangqiu Song
Xiaobei Wang
Zheng Lian
Xiaojiang Peng
29
0
0
10 May 2025
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang
Jong Youl Choi
Takuya Kurihaya
Isaac Lyngaas
Hong-Jun Yoon
...
Dali Wang
Peter Thornton
Prasanna Balaprakash
M. Ashfaq
Dan Lu
30
0
0
07 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
51
0
0
05 May 2025
PipeSpec: Breaking Stage Dependencies in Hierarchical LLM Decoding
Bradley McDanel
S. Zhang
Y. Hu
Zining Liu
MoE
172
0
0
02 May 2025
CodeSSM: Towards State Space Models for Code Understanding
Shweta Verma
Abhinav Anand
Mira Mezini
Mamba
46
0
0
02 May 2025
Phantora: Live GPU Cluster Simulation for Machine Learning System Performance Estimation
Jianxing Qin
Jingrong Chen
Xinhao Kong
Yongji Wu
Liang Luo
Zihan Wang
Ying Zhang
Tingjun Chen
Alvin R. Lebeck
Danyang Zhuo
148
0
0
02 May 2025
RayZer: A Self-supervised Large View Synthesis Model
Hanwen Jiang
Hao Tan
Peng Wang
Haian Jin
Yue Zhao
...
Kai Zhang
Fujun Luan
Kalyan Sunkavalli
Qixing Huang
Georgios Pavlakos
68
0
0
01 May 2025
FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension
Jushi Kai
Boyi Zeng
Yansen Wang
Haoli Bai
Bo Jiang
Bo Jiang
Zhouhan Lin
44
0
0
01 May 2025
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
Xuelong Li
Guangliang Cheng
Mamba
87
0
0
01 May 2025
GPU Performance Portability needs Autotuning
Burkhard Ringlein
Thomas Parnell
Radu Stoica
173
0
0
30 Apr 2025
Blockbuster, Part 1: Block-level AI Operator Fusion
Ofer Dekel
21
0
0
29 Apr 2025
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax
Zayd Muhammad Kawakibi Zuhri
Erland Hilman Fuadi
Alham Fikri Aji
33
0
0
29 Apr 2025
Search-Based Interaction For Conversation Recommendation via Generative Reward Model Based Simulated User
Xueliang Wang
Chunxuan Xia
Junyi Li
Fanzhe Meng
Lei Huang
Jinpeng Wang
Wayne Xin Zhao
Ji-Rong Wen
63
0
0
29 Apr 2025
LR-IAD: Mask-Free Industrial Anomaly Detection with Logical Reasoning
Peijian Zeng
Feiyan Pang
Zhanbo Wang
Aimin Yang
74
0
0
28 Apr 2025
LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
Zhengqin Li
Dilin Wang
Ka Chen
Zhaoyang Lv
Thu Nguyen-Phuoc
...
Yufeng Zhu
Carl S. Marshall
Yufeng Ren
Richard Newcombe
Zhao Dong
3DV
85
0
0
28 Apr 2025
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Yi Lu
Wanxu Zhao
Xin Zhou
Chenxin An
Cong Wang
...
Jun Zhao
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
46
0
0
26 Apr 2025
Tempo: Application-aware LLM Serving with Mixed SLO Requirements
Wei Zhang
Zhiyu Wu
Yi Mu
Banruo Liu
Myungjin Lee
Fan Lai
60
0
0
24 Apr 2025
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
Hanlei Zhang
Zhuohang Li
Yeshuang Zhu
Hua Xu
Peiwu Wang
Haige Zhu
Jie Zhou
Jinchao Zhang
43
0
0
23 Apr 2025
Efficient Pretraining Length Scaling
Bohong Wu
Shen Yan
Sijun Zhang
Jianqiao Lu
Yutao Zeng
Ya Wang
Xun Zhou
182
0
0
21 Apr 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang
Jiahui Peng
Ren Ma
Yucheng Wang
Tianyi Bai
Xingjian Wei
Jiantao Qiu
Chi Zhang
Ying Qian
Conghui He
53
0
0
19 Apr 2025
How Well Can General Vision-Language Models Learn Medicine By Watching Public Educational Videos?
Rahul Thapa
Andrew Li
Qingyang Wu
B. He
Yuki Sahashi
...
Angela Zhang
Ben Athiwaratkun
Shuaiwen Leon Song
David Ouyang
James Zou
LM&MA
49
0
0
19 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Leee Cheong
RALM
47
0
0
15 Apr 2025
Generative Large Language Model usage in Smart Contract Vulnerability Detection
Peter Ince
Jiangshan Yu
Joseph K. Liu
Xiaoning Du
37
0
0
07 Apr 2025
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
Akshara Prabhakar
Ziqiang Liu
Weiran Yao
Jianguo Zhang
Ming Zhu
...
Juan Carlos Niebles
Shelby Heinecke
Han Wang
Shri Kiran Srinivasan
Caiming Xiong
VGen
90
2
0
04 Apr 2025
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas
Akash Das
Shivam Gupta
Venkataramana Runkana
OffRL
52
1
0
02 Apr 2025
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Zhijun Wang
Jiahuan Li
Hao Zhou
Rongxiang Weng
Jie Wang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Shujian Huang
LRM
59
1
0
02 Apr 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
Jie Wang
Tao Dai
Shu-Tao Xia
Luca Benini
72
2
0
30 Mar 2025
Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
Raman Dutt
Harleen Hanspal
Guoxuan Xia
Petru-Daniel Tudosiu
Alexander Black
Yongxin Yang
Jingyu Sun
Sarah Parisot
MoE
43
0
0
28 Mar 2025
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
Yijiong Yu
LRM
AIMat
92
1
0
26 Mar 2025
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
Yu-Hsi Chen
44
0
0
21 Mar 2025
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Tongyao Zhu
Qian Liu
Haonan Wang
Shiqi Chen
Xiangming Gu
Tianyu Pang
Min-Yen Kan
44
0
0
19 Mar 2025
From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment
J. Li
Jian Guan
Songhao Wu
Wei Wu
Rui Yan
70
1
0
19 Mar 2025
ML-Triton, A Multi-Level Compilation and Language Extension to Triton GPU Programming
Dewei Wang
Wei Zhu
Liyang Ling
Ettore Tiotto
Quintin Wang
Whitney Tsang
Julian Opperman
Jacky Deng
46
0
0
19 Mar 2025
Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis
Imanol G. Estepa
Jesús M. Rodríguez-de-Vera
Ignacio Sarasúa
Bhalaji Nagarajan
Petia Radeva
54
0
0
19 Mar 2025
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models
Yuxiang Lai
Shitian Zhao
Ming Li
Jike Zhong
Xiaofeng Yang
OffRL
LRM
LM&MA
VLM
81
11
0
18 Mar 2025
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
M. Beck
Korbinian Poppel
Phillip Lippe
Sepp Hochreiter
69
1
0
18 Mar 2025
Pensez: Less Data, Better Reasoning -- Rethinking French LLM
Huy Hoang Ha
ReLM
LRM
68
1
0
17 Mar 2025
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Teng Wang
Zhangyi Jiang
Zhenqi He
Wenhan Yang
Yanan Zheng
Zeyu Li
Zifan He
Shenyang Tong
Hailei Gong
LRM
90
2
0
16 Mar 2025
Cost-Optimal Grouped-Query Attention for Long-Context Modeling
Yuxiao Chen
Yutong Wu
Chenyang Song
Zhiyuan Liu
Maosong Sun
Xu Han
73
0
0
12 Mar 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
169
2
0
07 Mar 2025