2404.14462
Towards smaller, faster decoder-only transformers: Architectural variants and their implications
22 April 2024
Sathya Krishnan Suresh
P. Shunmugapriya
Papers citing "Towards smaller, faster decoder-only transformers: Architectural variants and their implications"
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
141
106
0
26 Mar 2024
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
163
478
0
06 Nov 2019