2205.14135
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
Papers citing
"FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
50 / 1,508 papers shown
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning
Xiao Wang
Tianze Chen
Xianjun Yang
Qi Zhang
Xun Zhao
Dahua Lin
ELM
82
7
0
16 Apr 2024
Referring Flexible Image Restoration
Runwei Guan
Rongsheng Hu
Zhuhao Zhou
Tianlang Xue
Ka Lok Man
Jeremy S. Smith
Eng Gee Lim
Weiping Ding
Yutao Yue
81
0
0
16 Apr 2024
Long-form music generation with latent diffusion
Zach Evans
Julian Parker
CJ Carr
Zack Zukowski
Josiah Taylor
Jordi Pons
MGen
DiffM
122
45
0
16 Apr 2024
Improving the Capabilities of Large Language Model Based Marketing Analytics Copilots With Semantic Search And Fine-Tuning
Yilin Gao
Arava Sai Kumar
Yancheng Li
James W. Snyder
AI4MH
104
2
0
16 Apr 2024
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton Earnshaw
70
35
0
16 Apr 2024
I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
Noah Lewis
J. L. Bez
Suren Byna
109
0
0
16 Apr 2024
Adaptive Patching for High-resolution Image Segmentation with Transformers
Enzhi Zhang
Isaac Lyngaas
Peng Chen
Xiao Wang
Jun Igarashi
Yuankai Huo
Mohamed Wahib
M. Munetomo
MedIm
75
2
0
15 Apr 2024
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism
Bingya Wu
Shengyu Liu
Yinmin Zhong
Peng Sun
Xuanzhe Liu
Xin Jin
RALM
106
63
0
15 Apr 2024
Exploring and Improving Drafts in Blockwise Parallel Decoding
Taehyeon Kim
A. Suresh
Kishore Papineni
Michael Riley
Sanjiv Kumar
Adrian Benton
AI4TS
90
2
0
14 Apr 2024
CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models
Je-Yong Lee
Donghyun Lee
Genghan Zhang
Mo Tiwari
Azalia Mirhoseini
73
21
0
12 Apr 2024
Reducing hallucination in structured outputs via Retrieval-Augmented Generation
Patrice Béchard
Orlando Marquez Ayala
LLMAG
96
61
0
12 Apr 2024
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam
Youngsuk Park
Hao Zhou
Parameswaran Raman
Wooseok Ha
100
17
0
11 Apr 2024
Behavior Trees Enable Structured Programming of Language Model Agents
Richard Kelley
AI4CE
LM&Ro
LLMAG
99
0
0
11 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM
LLMAG
CLL
157
121
0
10 Apr 2024
FiP: a Fixed-Point Approach for Causal Generative Modeling
M. Scetbon
Joel Jennings
Agrin Hilmkil
Cheng Zhang
Chao Ma
123
3
0
10 Apr 2024
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Thomas Merth
Qichen Fu
Mohammad Rastegari
Mahyar Najibi
LRM
RALM
102
10
0
10 Apr 2024
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou
Qingyang Wang
Han Zhao
Jiangang Kong
Yi Yang
Yangdong Deng
98
0
0
10 Apr 2024
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks
Chonghua Wang
Haodong Duan
Songyang Zhang
Dahua Lin
Kai-xiang Chen
ELM
82
23
0
09 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Bo Peng
Daniel Goldstein
Quentin G. Anthony
Alon Albalak
Eric Alcaide
...
Bingchen Zhao
Qihang Zhao
Peng Zhou
Jian Zhu
Ruijie Zhu
119
82
0
08 Apr 2024
Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics
Zhengde Zhang
Yiyu Zhang
Haodong Yao
Jianwen Luo
Rui Zhao
...
Ke Li
Lina Zhao
Jun Cao
Fazhi Qi
Changzheng Yuan
52
2
0
08 Apr 2024
MemFlow: Optical Flow Estimation and Prediction with Memory
Qiaole Dong
Yanwei Fu
109
21
0
07 Apr 2024
Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
Weilin Cai
Juyong Jiang
Le Qin
Junwei Cui
Sunghun Kim
Jiayi Huang
185
10
0
07 Apr 2024
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
Joao Coelho
Bruno Martins
João Magalhães
Jamie Callan
Chenyan Xiong
RALM
102
8
0
05 Apr 2024
Sailor: Open Language Models for South-East Asia
Longxu Dou
Qian Liu
Guangtao Zeng
Jia Guo
Jiahui Zhou
Wei Lu
Min Lin
LRM
106
9
0
04 Apr 2024
VIAssist: Adapting Multi-modal Large Language Models for Users with Visual Impairments
Bufang Yang
Lixing He
Kaiwei Liu
Zhenyu Yan
111
22
0
03 Apr 2024
Enhancing Human-Computer Interaction in Chest X-ray Analysis using Vision and Language Model with Eye Gaze Patterns
Yunsoo Kim
Jinge Wu
Yusuf Abdulle
Yue Gao
Honghan Wu
62
5
0
03 Apr 2024
Linear Attention Sequence Parallelism
Weigao Sun
Zhen Qin
Dong Li
Xuyang Shen
Yu Qiao
Yiran Zhong
150
2
0
03 Apr 2024
Emergent Abilities in Reduced-Scale Generative Language Models
Sherin Muckatira
Vijeta Deshpande
Vladislav Lialin
Anna Rumshisky
ReLM
ELM
LRM
66
5
0
02 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
88
9
0
02 Apr 2024
Position-Aware Parameter Efficient Fine-Tuning Approach for Reducing Positional Bias in LLMs
Zheng Zhang
Fan Yang
Ziyan Jiang
Zheng Chen
Zhengyang Zhao
Chengyuan Ma
Liang Zhao
Yang Liu
67
6
0
01 Apr 2024
Stream of Search (SoS): Learning to Search in Language
Kanishk Gandhi
Denise Lee
Gabriel Grand
Muxin Liu
Winson Cheng
Archit Sharma
Noah D. Goodman
RALM
AIFin
LRM
100
68
0
01 Apr 2024
On Difficulties of Attention Factorization through Shared Memory
Uladzislau Yorsh
Martin Holevna
Ondrej Bojar
David Herel
55
0
0
31 Mar 2024
DailyMAE: Towards Pretraining Masked Autoencoders in One Day
Jiantao Wu
Shentong Mo
Sara Atito
Zhenhua Feng
Josef Kittler
Muhammad Awais
84
3
0
31 Mar 2024
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
Saleh Ashkboos
Amirkeivan Mohtashami
Maximilian L. Croci
Bo Li
Martin Jaggi
Dan Alistarh
Torsten Hoefler
James Hensman
MQ
145
184
0
30 Mar 2024
Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks
Hyunjae Kim
Hyeon Hwang
Jiwoo Lee
Sihyeon Park
Dain Kim
Taewhoo Lee
Chanwoong Yoon
Jiwoong Sohn
Donghee Choi
Jaewoo Kang
ELM
AI4MH
LRM
118
22
0
30 Mar 2024
DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
Jinwei Yao
Kaiqi Chen
Kexun Zhang
Jiaxuan You
Binhang Yuan
Zeke Wang
Tao Lin
114
4
0
30 Mar 2024
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference
Jovan Stojkovic
Esha Choukse
Chaojie Zhang
Inigo Goiri
Josep Torrellas
78
41
0
29 Mar 2024
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
Luchang Li
Sheng Qian
Jie Lu
Lunxi Yuan
Rui Wang
Qin Xie
87
10
0
29 Mar 2024
CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning
Luke Rowe
Roger Girgis
Anthony Gosselin
Bruno Carrez
Florian Golemo
Felix Heide
Liam Paull
Christopher Pal
119
7
0
29 Mar 2024
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Elliot Bolton
Abhinav Venigalla
Michihiro Yasunaga
David Leo Wright Hall
Betty Xiong
...
R. Daneshjou
Jonathan Frankle
Percy Liang
Michael Carbin
Christopher D. Manning
LM&MA
MedIm
101
64
0
27 Mar 2024
Integrative Graph-Transformer Framework for Histopathology Whole Slide Image Representation and Classification
Zhan Shi
Jingwei Zhang
Jun Kong
Fusheng Wang
MedIm
96
5
0
26 Mar 2024
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Boyao Wang
Xiang Liu
Shizhe Diao
Renjie Pi
Jipeng Zhang
Chi Han
Tong Zhang
106
55
0
26 Mar 2024
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
Abdelrahman Abdallah
M. Kasem
Mahmoud Abdalla
Mohamed Mahmoud
Mohamed Elkasaby
Yasser Elbendary
Adam Jatowt
RALM
84
16
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
119
99
0
26 Mar 2024
Incorporating Exponential Smoothing into MLP: A Simple but Effective Sequence Model
Jiqun Chu
Zuoquan Lin
AI4TS
68
2
0
26 Mar 2024
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
Jinyi Li
Yihuai Lan
Lei Wang
Hao Wang
53
0
0
26 Mar 2024
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
Youpeng Zhao
Di Wu
Jun Wang
96
28
0
26 Mar 2024
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Yuda Song
Zehao Sun
Xuanwu Yin
VLM
89
18
0
25 Mar 2024
L-MAE: Longitudinal masked auto-encoder with time and severity-aware encoding for diabetic retinopathy progression prediction
Rachid Zeghlache
Pierre-Henri Conze
Mostafa EL HABIB DAHO
Yi-Hsuan Li
Alireza Rezaei
...
Pascale Massin
B. Cochener
Ikram Brahim
G. Quellec
M. Lamard
67
0
0
24 Mar 2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
Mohammad Mahmudul Alam
Edward Raff
Stella Biderman
Tim Oates
James Holt
AAML
90
4
0
23 Mar 2024