arXiv:2404.12362
Transformer tricks: Removing weights for skipless transformers
18 April 2024
Nils Graef
Papers citing "Transformer tricks: Removing weights for skipless transformers"
2 / 2 papers shown
Title
Flash normalization: fast normalization for LLMs
Nils Graef, Matthew Clapp, Andrew Wasielewski
21
0
0
12 Jul 2024
Transformer tricks: Precomputing the first layer
Nils Graef
MoE
29
4
0
20 Feb 2024