2403.02579
Cited By
Geometric Dynamics of Signal Propagation Predict Trainability of Transformers
5 March 2024
Aditya Cowsik
Tamra M. Nebabu
Xiao-Liang Qi
Surya Ganguli
Papers citing
"Geometric Dynamics of Signal Propagation Predict Trainability of Transformers"
10 / 10 papers shown
Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation
Zhuo-Yang Song
Zeyu Li
Qing-Hong Cao
Ming-xing Luo
Hua Xing Zhu
35
0
0
28 Mar 2025
The Geometry of Tokens in Internal Representations of Large Language Models
Karthik Viswanathan
Yuri Gardinazzi
Giada Panerai
Alberto Cazzaniga
Matteo Biagetti
AIFin
94
4
0
17 Jan 2025
Clustering in Causal Attention Masking
Nikita Karagodin
Yury Polyanskiy
Philippe Rigollet
60
5
0
07 Nov 2024
Emergence of meta-stable clustering in mean-field transformer models
Giuseppe Bruno
Federico Pasqualotto
Andrea Agazzi
45
6
0
30 Oct 2024
Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-dimensional Tokens
Vittorio Erba
Emanuele Troiani
Luca Biggio
Antoine Maillard
Lenka Zdeborová
23
0
0
24 Oct 2024
Graph Neural Networks Do Not Always Oversmooth
Bastian Epping
Alexandre René
M. Helias
Michael T. Schaub
38
3
0
04 Jun 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
C. Pehlevan
AI4CE
47
9
0
24 May 2024
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie
Guy Gur-Ari
Z. Ringel
34
1
0
07 Feb 2024
Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping
James Martens
Andy Ballard
Guillaume Desjardins
G. Swirszcz
Valentin Dalibard
Jascha Narain Sohl-Dickstein
S. Schoenholz
88
43
0
05 Oct 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021