2408.15138
Cited By
How transformers learn structured data: insights from hierarchical filtering
27 August 2024
Jerome Garnier-Brun
Marc Mézard
Emanuele Moscato
Luca Saglietti
Papers citing
"How transformers learn structured data: insights from hierarchical filtering"
4 / 4 papers shown
Title
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta
Alessandro Favero
Antonio Sclocchi
M. Wyart
26
0
0
11 May 2025
Learning curves theory for hierarchically compositional data with power-law distributed features
Francesco Cagnetta
Hyunmo Kang
M. Wyart
36
0
0
11 May 2025
A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende
Federica Gerace
A. Laio
Sebastian Goldt
73
8
0
17 Feb 2025
Probing the Latent Hierarchical Structure of Data via Diffusion Models
Antonio Sclocchi
Alessandro Favero
Noam Itzhak Levi
M. Wyart
DiffM
33
3
0
17 Oct 2024