ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.18724
  4. Cited By
Learning Associative Memories with Gradient Descent

Learning Associative Memories with Gradient Descent

28 February 2024
Vivien A. Cabannes
Berfin Simsek
A. Bietti
ArXivPDFHTML

Papers citing "Learning Associative Memories with Gradient Descent"

10 / 10 papers shown
Title
Birth of a Transformer: A Memory Viewpoint
Birth of a Transformer: A Memory Viewpoint
A. Bietti
Vivien A. Cabannes
Diane Bouchacourt
Hervé Jégou
Léon Bottou
54
89
0
01 Jun 2023
Infinite-width limit of deep linear neural networks
Infinite-width limit of deep linear neural networks
Lénaïc Chizat
Maria Colombo
Xavier Fernández-Real
Alessio Figalli
53
14
0
29 Nov 2022
In-context Learning and Induction Heads
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
296
494
0
24 Sep 2022
Beyond the Edge of Stability via Two-step Gradient Updates
Beyond the Edge of Stability via Two-step Gradient Updates
Lei Chen
Joan Bruna
MLT
14
9
0
08 Jun 2022
Transformer Feed-Forward Layers Are Key-Value Memories
Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
KELM
102
792
0
29 Dec 2020
Hopfield Networks is All You Need
Hopfield Networks is All You Need
Hubert Ramsauer
Bernhard Schafl
Johannes Lehner
Philipp Seidl
Michael Widrich
...
David P. Kreil
Michael K Kopp
Günter Klambauer
Johannes Brandstetter
Sepp Hochreiter
52
424
0
16 Jul 2020
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
68
332
0
13 Jun 2019
Stein Variational Gradient Descent: A General Purpose Bayesian Inference
  Algorithm
Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm
Qiang Liu
Dilin Wang
BDL
42
1,082
0
16 Aug 2016
Layer Normalization
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
199
10,412
0
21 Jul 2016
Distributed Representations of Words and Phrases and their
  Compositionality
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
242
33,445
0
16 Oct 2013
1