Towards a theory of how the structure of language is acquired by deep neural networks

28 May 2024
Francesco Cagnetta
Matthieu Wyart

Papers citing "Towards a theory of how the structure of language is acquired by deep neural networks"

16 / 16 papers shown
Learning curves theory for hierarchically compositional data with power-law distributed features
Francesco Cagnetta
Hyunmo Kang
Matthieu Wyart
98
1
0
11 May 2025
A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende
Federica Gerace
Alessandro Laio
Sebastian Goldt
107
8
0
17 Feb 2025
Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-dimensional Tokens
Vittorio Erba
Emanuele Troiani
Luca Biggio
Antoine Maillard
Lenka Zdeborová
156
1
0
24 Oct 2024
Probing the Latent Hierarchical Structure of Data via Diffusion Models
Antonio Sclocchi
Alessandro Favero
Noam Itzhak Levi
Matthieu Wyart
DiffM
82
5
0
17 Oct 2024
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
Cengiz Pehlevan
89
41
0
02 Feb 2024
Large Language Models
Michael R Douglas
LLMAG
LM&MA
135
625
0
11 Jul 2023
Autocorrelations Decay in Texts and Applicability Limits of Language Models
N. Mikhaylovskiy
I. Churilov
21
6
0
11 May 2023
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
277
2,474
0
15 Jun 2022
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
62
261
0
12 Feb 2021
Feature Learning in Infinite-Width Neural Networks
Greg Yang
J. E. Hu
MLT
75
153
0
30 Nov 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
602
4,801
0
23 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
493
42,407
0
03 Dec 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
135
1,471
0
15 May 2019
Emergence of order in random languages
Eric De Giuli
LRM
20
10
0
20 Feb 2019
Dissecting Contextual Word Embeddings: Architecture and Representation
Matthew E. Peters
Mark Neumann
Luke Zettlemoyer
Wen-tau Yih
96
429
0
27 Aug 2018
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
308
2,859
0
26 Sep 2016