
Mapping of attention mechanisms to a generalized Potts model (arXiv:2304.07235)
Riccardo Rende, Federica Gerace, Alessandro Laio, Sebastian Goldt
14 April 2023

Papers citing "Mapping of attention mechanisms to a generalized Potts model"

13 / 13 papers shown
Capturing AI's Attention: Physics of Repetition, Hallucination, Bias and Beyond
Frank Yingjie Huo, Neil F. Johnson
06 Apr 2025

A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende, Federica Gerace, Alessandro Laio, Sebastian Goldt
17 Feb 2025

Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-dimensional Tokens
Vittorio Erba, Emanuele Troiani, Luca Biggio, Antoine Maillard, Lenka Zdeborová
24 Oct 2024

Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco
26 Sep 2024

Self-attention as an attractor network: transient memories without backpropagation
Francesco D'Amico, Matteo Negri
24 Sep 2024

How transformers learn structured data: insights from hierarchical filtering
Jerome Garnier-Brun, Marc Mézard, Emanuele Moscato, Luca Saglietti
27 Aug 2024

Dynamical Mean-Field Theory of Self-Attention Neural Networks
Ángel Poc-López, Miguel Aguilera
11 Jun 2024

Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende, Luciano Loris Viteritti
29 May 2024

Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
Lorenzo Tiberi, Francesca Mignacco, Kazuki Irie, H. Sompolinsky
24 May 2024

Anchor function: a type of benchmark functions for studying language models
Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, E. Weinan, Z. Xu
16 Jan 2024

Eight challenges in developing theory of intelligence
Haiping Huang
20 Jun 2023

Fluctuation based interpretable analysis scheme for quantum many-body snapshots
Henning Schlömer, A. Bohrdt
12 Apr 2023

Data-driven emergence of convolutional structure in neural networks
Alessandro Ingrosso, Sebastian Goldt
01 Feb 2022