2311.03426
Cited By
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
6 November 2023
Farnoosh Javadi
Walid Ahmed
Habib Hajimolahoseini
Foozhan Ataiefard
Mohammad Hassanpour
Saina Asani
Austin Wen
Omar Mohamed Awad
Kangling Liu
Yang Liu
VLM
Papers citing
"GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values"
6 / 6 papers shown
Beyond Uniform Query Distribution: Key-Driven Grouped Query Attention
Zohaib Khan
Muhammad Khaquan
Omer Tafveez
Burhanuddin Samiwala
Agha Ali Raza
40
3
0
15 Aug 2024
Accelerating the Low-Rank Decomposed Models
Habib Hajimolahoseini
Walid Ahmed
Austin Wen
Yang Liu
19
0
0
24 Jul 2024
Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
Habib Hajimolahoseini
Walid Ahmed
Austin Wen
Yang Liu
29
0
0
23 Jul 2024
QCQA: Quality and Capacity-aware grouped Query Attention
Vinay Joshi
Prashant Laddha
Shambhavi Sinha
O. J. Omer
S. Subramoney
24
5
0
08 Jun 2024
SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection
Foozhan Ataiefard
Walid Ahmed
Habib Hajimolahoseini
Saina Asani
Farnoosh Javadi
Mohammad Hassanpour
Omar Mohamed Awad
Austin Wen
Kangling Liu
Yang Liu
ViT
30
3
0
27 Jan 2024
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,858
0
18 Apr 2021