Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.08691
Cited By
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
17 July 2023
Tri Dao
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning"
30 / 230 papers shown
Title
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM
AILaw
52
66
0
06 Mar 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
Zhenyu Zhang
Runjin Chen
Shiwei Liu
Zhewei Yao
Olatunji Ruwase
Beidi Chen
Xiaoxia Wu
Zhangyang Wang
34
26
0
05 Mar 2024
Stable LM 2 1.6B Technical Report
Marco Bellagente
J. Tow
Dakota Mahan
Duy Phung
Maksym Zhuravinskyi
...
Paulo Rocha
Harry Saini
H. Teufel
Niccoló Zanichelli
Carlos Riquelme
OSLM
49
52
0
27 Feb 2024
Training-Free Long-Context Scaling of Large Language Models
Chen An
Fei Huang
Jun Zhang
Shansan Gong
Xipeng Qiu
Chang Zhou
Lingpeng Kong
ALM
LRM
40
35
0
27 Feb 2024
Investigating the Effectiveness of HyperTuning via Gisting
Jason Phang
46
0
0
26 Feb 2024
Seamless Human Motion Composition with Blended Positional Encodings
Germán Barquero
Sergio Escalera
Cristina Palmero
DiffM
51
29
0
23 Feb 2024
RelayAttention for Efficient Large Language Model Serving with Long System Prompts
Lei Zhu
Xinjiang Wang
Wayne Zhang
Rynson W. H. Lau
33
6
0
22 Feb 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao
Yuanbin Qu
Konrad Staniszewski
Szymon Tworkowski
Wei Liu
Piotr Milo's
Yuxiang Wu
Pasquale Minervini
34
14
0
21 Feb 2024
LEIA: Facilitating Cross-lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
Ikuya Yamada
Ryokan Ri
KELM
25
0
0
18 Feb 2024
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference
Siyu Ren
Kenny Q. Zhu
26
27
0
09 Feb 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
34
60
0
08 Feb 2024
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
Arnav Chavan
Raghav Magazine
Shubham Kushwaha
M. Debbah
Deepak Gupta
18
18
0
02 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
41
1
0
01 Feb 2024
Institutional Platform for Secure Self-Service Large Language Model Exploration
V. Bumgardner
Mitchell A. Klusty
W. V. Logan
Samuel E. Armstrong
Caylin D. Hickey
Jeff Talbert
Caylin Hickey
Jeff Talbert
58
1
0
01 Feb 2024
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
N. Corrêa
Sophia Falk
Shiza Fatimah
Aniket Sen
N. D. Oliveira
30
9
0
30 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
43
12
0
26 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
39
14
0
25 Jan 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
69
35
0
16 Jan 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
Heming Xia
Zhe Yang
Qingxiu Dong
Peiyi Wang
Yongqi Li
Tao Ge
Tianyu Liu
Wenjie Li
Zhifang Sui
LRM
38
101
0
15 Jan 2024
Run LoRA Run: Faster and Lighter LoRA Implementations
Daria Cherniuk
A. Mikhalev
Ivan Oseledets
AI4CE
22
1
0
06 Dec 2023
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
Bill Yuchen Lin
Abhilasha Ravichander
Ximing Lu
Nouha Dziri
Melanie Sclar
Khyathi Raghavi Chandu
Chandra Bhagavatula
Yejin Choi
22
166
0
04 Dec 2023
Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model
Yen-Ting Lin
Yun-Nung Chen
40
20
0
29 Nov 2023
Generative Judge for Evaluating Alignment
Junlong Li
Shichao Sun
Weizhe Yuan
Run-Ze Fan
Hai Zhao
Pengfei Liu
ELM
ALM
35
79
0
09 Oct 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
56
152
0
21 Sep 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Zenan Zhou
Zhiying Wu
ELM
LRM
77
710
0
19 Sep 2023
Local Large Language Models for Complex Structured Medical Tasks
V. Bumgardner
Aaron D. Mullen
Samuel E. Armstrong
Caylin D. Hickey
Jeffrey A. Talbert
36
5
0
03 Aug 2023
Ray-Patch: An Efficient Querying for Light Field Transformers
T. B. Martins
Javier Civera
ViT
39
0
0
16 May 2023
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,023
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
252
580
0
12 Mar 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,833
0
17 Sep 2019
Previous
1
2
3
4
5