2402.18724
Cited By
Learning Associative Memories with Gradient Descent
28 February 2024
Vivien A. Cabannes
Berfin Simsek
A. Bietti
Papers citing
"Learning Associative Memories with Gradient Descent"
10 / 10 papers shown
Title
Birth of a Transformer: A Memory Viewpoint
A. Bietti
Vivien A. Cabannes
Diane Bouchacourt
Hervé Jégou
Léon Bottou
48
89
0
01 Jun 2023
Infinite-width limit of deep linear neural networks
Lénaïc Chizat
Maria Colombo
Xavier Fernández-Real
Alessio Figalli
44
14
0
29 Nov 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
296
494
0
24 Sep 2022
Beyond the Edge of Stability via Two-step Gradient Updates
Lei Chen
Joan Bruna
MLT
14
9
0
08 Jun 2022
Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
KELM
91
792
0
29 Dec 2020
Hopfield Networks is All You Need
Hubert Ramsauer
Bernhard Schäfl
Johannes Lehner
Philipp Seidl
Michael Widrich
...
David P. Kreil
Michael K Kopp
Günter Klambauer
Johannes Brandstetter
Sepp Hochreiter
40
424
0
16 Jul 2020
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
59
332
0
13 Jun 2019
Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm
Qiang Liu
Dilin Wang
BDL
35
1,082
0
16 Aug 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
154
10,412
0
21 Jul 2016
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
189
33,445
0
16 Oct 2013