ResearchTrend.AI

Geometric Dynamics of Signal Propagation Predict Trainability of Transformers

5 March 2024
Aditya Cowsik
Tamra M. Nebabu
Xiao-Liang Qi
Surya Ganguli
arXiv:2403.02579

Papers citing "Geometric Dynamics of Signal Propagation Predict Trainability of Transformers"

10 / 10 papers shown
Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation
Zhuo-Yang Song
Zeyu Li
Qing-Hong Cao
Ming-xing Luo
Hua Xing Zhu
35
0
0
28 Mar 2025
The Geometry of Tokens in Internal Representations of Large Language Models
Karthik Viswanathan
Yuri Gardinazzi
Giada Panerai
Alberto Cazzaniga
Matteo Biagetti
AIFin
94
4
0
17 Jan 2025
Clustering in Causal Attention Masking
Nikita Karagodin
Yury Polyanskiy
Philippe Rigollet
60
5
0
07 Nov 2024
Emergence of meta-stable clustering in mean-field transformer models
Giuseppe Bruno
Federico Pasqualotto
Andrea Agazzi
45
6
0
30 Oct 2024
Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-dimensional Tokens
Vittorio Erba
Emanuele Troiani
Luca Biggio
Antoine Maillard
Lenka Zdeborová
23
0
0
24 Oct 2024
Graph Neural Networks Do Not Always Oversmooth
Bastian Epping
Alexandre René
M. Helias
Michael T. Schaub
38
3
0
04 Jun 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
C. Pehlevan
AI4CE
47
9
0
24 May 2024
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie
Guy Gur-Ari
Z. Ringel
34
1
0
07 Feb 2024
Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping
James Martens
Andy Ballard
Guillaume Desjardins
G. Swirszcz
Valentin Dalibard
Jascha Narain Sohl-Dickstein
S. Schoenholz
88
43
0
05 Oct 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021