arXiv:2205.14135
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
Papers citing
"FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
50 / 1,439 papers shown
Title
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM
LLMAG
CLL
91
106
0
10 Apr 2024
FiP: a Fixed-Point Approach for Causal Generative Modeling
M. Scetbon
Joel Jennings
Agrin Hilmkil
Cheng Zhang
Chao Ma
45
2
0
10 Apr 2024
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Thomas Merth
Qichen Fu
Mohammad Rastegari
Mahyar Najibi
LRM
RALM
39
9
0
10 Apr 2024
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou
Qingyang Wang
Han Zhao
Jiangang Kong
Yi Yang
Yangdong Deng
52
0
0
10 Apr 2024
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks
Chonghua Wang
Haodong Duan
Songyang Zhang
Dahua Lin
Kai-xiang Chen
ELM
31
17
0
09 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Bo Peng
Daniel Goldstein
Quentin G. Anthony
Alon Albalak
Eric Alcaide
...
Bingchen Zhao
Qihang Zhao
Peng Zhou
Jian Zhu
Ruijie Zhu
56
73
0
08 Apr 2024
Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics
Zhengde Zhang
Yiyu Zhang
Haodong Yao
Jianwen Luo
Rui Zhao
...
Ke Li
Lina Zhao
Jun Cao
Fazhi Qi
Changzheng Yuan
40
2
0
08 Apr 2024
Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
Weilin Cai
Juyong Jiang
Le Qin
Junwei Cui
Sunghun Kim
Jiayi Huang
64
7
0
07 Apr 2024
MemFlow: Optical Flow Estimation and Prediction with Memory
Qiaole Dong
Yanwei Fu
30
19
0
07 Apr 2024
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
Joao Coelho
Bruno Martins
João Magalhães
Jamie Callan
Chenyan Xiong
RALM
65
5
0
05 Apr 2024
Sailor: Open Language Models for South-East Asia
Longxu Dou
Qian Liu
Guangtao Zeng
Jia Guo
Jiahui Zhou
Wei Lu
Min Lin
LRM
45
9
0
04 Apr 2024
VIAssist: Adapting Multi-modal Large Language Models for Users with Visual Impairments
Bufang Yang
Lixing He
Kaiwei Liu
Zhenyu Yan
45
20
0
03 Apr 2024
Enhancing Human-Computer Interaction in Chest X-ray Analysis using Vision and Language Model with Eye Gaze Patterns
Yunsoo Kim
Jinge Wu
Yusuf Abdulle
Yue Gao
Honghan Wu
50
3
0
03 Apr 2024
Linear Attention Sequence Parallelism
Weigao Sun
Zhen Qin
Dong Li
Xuyang Shen
Yu Qiao
Yiran Zhong
76
2
0
03 Apr 2024
Emergent Abilities in Reduced-Scale Generative Language Models
Sherin Muckatira
Vijeta Deshpande
Vladislav Lialin
Anna Rumshisky
ReLM
ELM
LRM
41
4
0
02 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
45
8
0
02 Apr 2024
Position-Aware Parameter Efficient Fine-Tuning Approach for Reducing Positional Bias in LLMs
Zheng Zhang
Fan Yang
Ziyan Jiang
Zheng Chen
Zhengyang Zhao
Chengyuan Ma
Liang Zhao
Yang Liu
34
5
0
01 Apr 2024
Stream of Search (SoS): Learning to Search in Language
Kanishk Gandhi
Denise Lee
Gabriel Grand
Muxin Liu
Winson Cheng
Archit Sharma
Noah D. Goodman
RALM
AIFin
LRM
52
47
0
01 Apr 2024
On Difficulties of Attention Factorization through Shared Memory
Uladzislau Yorsh
Martin Holevna
Ondrej Bojar
David Herel
31
0
0
31 Mar 2024
DailyMAE: Towards Pretraining Masked Autoencoders in One Day
Jiantao Wu
Shentong Mo
Sara Atito
Zhenhua Feng
Josef Kittler
Muhammad Awais
48
3
0
31 Mar 2024
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
Saleh Ashkboos
Amirkeivan Mohtashami
Maximilian L. Croci
Bo Li
Martin Jaggi
Dan Alistarh
Torsten Hoefler
James Hensman
MQ
42
144
0
30 Mar 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura
Mayank Mishra
Simone Tedeschi
Yekun Chai
Jason T Stillerman
...
Virendra Mehta
Matthew Blumberg
Victor May
Huu Nguyen
S. Pyysalo
LRM
53
7
0
30 Mar 2024
Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks
Hyunjae Kim
Hyeon Hwang
Jiwoo Lee
Sihyeon Park
Dain Kim
Taewhoo Lee
Chanwoong Yoon
Jiwoong Sohn
Donghee Choi
Jaewoo Kang
ELM
AI4MH
LRM
72
19
0
30 Mar 2024
DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
Jinwei Yao
Kaiqi Chen
Kexun Zhang
Jiaxuan You
Binhang Yuan
Zeke Wang
Tao Lin
50
3
0
30 Mar 2024
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference
Jovan Stojkovic
Esha Choukse
Chaojie Zhang
Inigo Goiri
Josep Torrellas
43
37
0
29 Mar 2024
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
Luchang Li
Sheng Qian
Jie Lu
Lunxi Yuan
Rui Wang
Qin Xie
54
9
0
29 Mar 2024
CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning
Luke Rowe
Roger Girgis
Anthony Gosselin
Bruno Carrez
Florian Golemo
Felix Heide
Liam Paull
Christopher Pal
53
4
0
29 Mar 2024
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Elliot Bolton
Abhinav Venigalla
Michihiro Yasunaga
David Leo Wright Hall
Betty Xiong
...
R. Daneshjou
Jonathan Frankle
Percy Liang
Michael Carbin
Christopher D. Manning
LM&MA
MedIm
37
52
0
27 Mar 2024
Integrative Graph-Transformer Framework for Histopathology Whole Slide Image Representation and Classification
Zhan Shi
Jingwei Zhang
Jun Kong
Fusheng Wang
MedIm
51
3
0
26 Mar 2024
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Rui Pan
Xiang Liu
Shizhe Diao
Renjie Pi
Jipeng Zhang
Chi Han
Tong Zhang
48
38
0
26 Mar 2024
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
Abdelrahman Abdallah
M. Kasem
Mahmoud Abdalla
Mohamed Mahmoud
Mohamed Elkasaby
Yasser Elbendary
Adam Jatowt
RALM
36
13
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
44
90
0
26 Mar 2024
Incorporating Exponential Smoothing into MLP: A Simple but Effective Sequence Model
Jiqun Chu
Zuoquan Lin
AI4TS
42
2
0
26 Mar 2024
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
Jinyi Li
Yihuai Lan
Lei Wang
Hao Wang
35
0
0
26 Mar 2024
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
Youpeng Zhao
Di Wu
Jun Wang
35
26
0
26 Mar 2024
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Yuda Song
Zehao Sun
Xuanwu Yin
VLM
51
17
0
25 Mar 2024
L-MAE: Longitudinal masked auto-encoder with time and severity-aware encoding for diabetic retinopathy progression prediction
Rachid Zeghlache
Pierre-Henri Conze
Mostafa EL HABIB DAHO
Yi-Hsuan Li
Alireza Rezaei
...
Pascale Massin
B. Cochener
Ikram Brahim
G. Quellec
M. Lamard
41
0
0
24 Mar 2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
Mohammad Mahmudul Alam
Edward Raff
Stella Biderman
Tim Oates
James Holt
AAML
43
3
0
23 Mar 2024
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Yi Wang
Kunchang Li
Xinhao Li
Jiashuo Yu
Yinan He
...
Hongjie Zhang
Yifei Huang
Yu Qiao
Yali Wang
Limin Wang
44
52
0
22 Mar 2024
LimGen: Probing the LLMs for Generating Suggestive Limitations of Research Papers
Abdur Rahman Bin Md Faizullah
Ashok Urlana
Rahul Mishra
41
4
0
22 Mar 2024
Hierarchical Skip Decoding for Efficient Autoregressive Text Generation
Yunqi Zhu
Xuebing Yang
Yuanyuan Wu
Wensheng Zhang
47
3
0
22 Mar 2024
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
Vincent Tao Hu
S. A. Baumann
Ming Gui
Olga Grebenkova
Pingchuan Ma
Johannes S. Fischer
Bjorn Ommer
47
42
0
20 Mar 2024
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts
Guangzeng Han
Weisi Liu
Xiaolei Huang
Brian Borsari
41
21
0
20 Mar 2024
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng
Richong Zhang
Junhao Zhang
Yanhan Ye
Zheyan Luo
Zhangchi Feng
Yongqiang Ma
55
416
0
20 Mar 2024
Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks
Bo-Ru Lu
Nikita Haduong
Chien-Yu Lin
Hao Cheng
Noah A. Smith
Mari Ostendorf
AI4CE
40
0
0
19 Mar 2024
MELTing point: Mobile Evaluation of Language Transformers
Stefanos Laskaridis
Kleomenis Katevas
Lorenzo Minto
Hamed Haddadi
31
23
0
19 Mar 2024
WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar
Runwei Guan
Liye Jia
Fengyufan Yang
Shanliang Yao
Erick Purwanto
...
Eng Gee Lim
Jeremy S. Smith
Ka Lok Man
Xuming Hu
Yutao Yue
47
9
0
19 Mar 2024
HCPM: Hierarchical Candidates Pruning for Efficient Detector-Free Matching
Ying Chen
Yong-Jin Liu
Kai Wu
Qiang Nie
Shang Xu
Huifang Ma
Bing Wang
Chengjie Wang
VLM
45
1
0
19 Mar 2024
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi
Yuxiang Chen
Kang Zhao
Kaijun Zheng
Jianfei Chen
Jun Zhu
MQ
52
21
0
19 Mar 2024
HDLdebugger: Streamlining HDL debugging with Large Language Models
Xufeng Yao
Haoyang Li
T. H. Chan
Wenyi Xiao
Mingxuan Yuan
Yu Huang
Lei Chen
Bei Yu
24
20
0
18 Mar 2024