2312.10794
A mathematical perspective on Transformers
17 December 2023
Borjan Geshkovski
Cyril Letrouit
Yury Polyanskiy
Philippe Rigollet
EDL
AI4CE
Papers citing "A mathematical perspective on Transformers"
13 / 13 papers shown
Title
Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures
Heng-Sheng Chang
P. Mehta
39
0
0
01 May 2025
Quantitative Clustering in Mean-Field Transformer Models
Shi Chen
Zhengjiang Lin
Yury Polyanskiy
Philippe Rigollet
38
0
0
20 Apr 2025
Exact Sequence Classification with Hardmax Transformers
Albert Alcalde
Giovanni Fantuzzi
Enrique Zuazua
77
1
0
04 Feb 2025
OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization
Kelvin Kan
Xingjian Li
Stanley Osher
99
2
0
30 Jan 2025
The Geometry of Tokens in Internal Representations of Large Language Models
Karthik Viswanathan
Yuri Gardinazzi
Giada Panerai
Alberto Cazzaniga
Matteo Biagetti
AIFin
94
4
0
17 Jan 2025
Emergence of meta-stable clustering in mean-field transformer models
Giuseppe Bruno
Federico Pasqualotto
Andrea Agazzi
45
6
0
30 Oct 2024
Clustering in pure-attention hardmax transformers and its role in sentiment analysis
Albert Alcalde
Giovanni Fantuzzi
Enrique Zuazua
35
3
0
26 Jun 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
Lorenzo Tiberi
Francesca Mignacco
Kazuki Irie
H. Sompolinsky
44
6
0
24 May 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
34
1
0
13 Mar 2024
How Smooth Is Attention?
Valérie Castin
Pierre Ablin
Gabriel Peyré
AAML
40
9
0
22 Dec 2023
Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems
Subhabrata Dutta
Tanya Gautam
Soumen Chakrabarti
Tanmoy Chakraborty
51
15
0
30 Sep 2021
A Class of Dimension-free Metrics for the Convergence of Empirical Measures
Jiequn Han
Ruimeng Hu
Jihao Long
23
3
0
24 Apr 2021
Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach
Grant M. Rotskoff
Eric Vanden-Eijnden
59
118
0
02 May 2018