ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2312.10794
  4. Cited By
A mathematical perspective on Transformers

A mathematical perspective on Transformers

17 December 2023
Borjan Geshkovski
Cyril Letrouit
Yury Polyanskiy
Philippe Rigollet
    EDL
    AI4CE
ArXivPDFHTML

Papers citing "A mathematical perspective on Transformers"

13 / 13 papers shown
Title
Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures
Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures
Heng-Sheng Chang
P. Mehta
39
0
0
01 May 2025
Quantitative Clustering in Mean-Field Transformer Models
Quantitative Clustering in Mean-Field Transformer Models
Shi Chen
Zhengjiang Lin
Yury Polyanskiy
Philippe Rigollet
38
0
0
20 Apr 2025
Exact Sequence Classification with Hardmax Transformers
Exact Sequence Classification with Hardmax Transformers
Albert Alcalde
Giovanni Fantuzzi
Enrique Zuazua
77
1
0
04 Feb 2025
OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization
OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization
Kelvin Kan
Xingjian Li
Stanley Osher
99
2
0
30 Jan 2025
The Geometry of Tokens in Internal Representations of Large Language Models
The Geometry of Tokens in Internal Representations of Large Language Models
Karthik Viswanathan
Yuri Gardinazzi
Giada Panerai
Alberto Cazzaniga
Matteo Biagetti
AIFin
94
4
0
17 Jan 2025
Emergence of meta-stable clustering in mean-field transformer models
Emergence of meta-stable clustering in mean-field transformer models
Giuseppe Bruno
Federico Pasqualotto
Andrea Agazzi
45
6
0
30 Oct 2024
Clustering in pure-attention hardmax transformers and its role in
  sentiment analysis
Clustering in pure-attention hardmax transformers and its role in sentiment analysis
Albert Alcalde
Giovanni Fantuzzi
Enrique Zuazua
35
3
0
26 Jun 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics
  Theory of Transformers
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
Lorenzo Tiberi
Francesca Mignacco
Kazuki Irie
H. Sompolinsky
44
6
0
24 May 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution
  in Large Language Models
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
34
1
0
13 Mar 2024
How Smooth Is Attention?
How Smooth Is Attention?
Valérie Castin
Pierre Ablin
Gabriel Peyré
AAML
40
9
0
22 Dec 2023
Redesigning the Transformer Architecture with Insights from
  Multi-particle Dynamical Systems
Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems
Subhabrata Dutta
Tanya Gautam
Soumen Chakrabarti
Tanmoy Chakraborty
51
15
0
30 Sep 2021
A Class of Dimension-free Metrics for the Convergence of Empirical
  Measures
A Class of Dimension-free Metrics for the Convergence of Empirical Measures
Jiequn Han
Ruimeng Hu
Jihao Long
23
3
0
24 Apr 2021
Trainability and Accuracy of Neural Networks: An Interacting Particle
  System Approach
Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach
Grant M. Rotskoff
Eric Vanden-Eijnden
59
118
0
02 May 2018
1