Mechanistic evaluation of Transformers and state space models
arXiv:2505.15105 · 21 May 2025
Aryaman Arora, Neil Rathi, Nikil Roashan Selvam, Róbert Csordás, Dan Jurafsky, Christopher Potts
Papers citing "Mechanistic evaluation of Transformers and state space models" (11 papers shown)
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick, Eric P. Xing, Albert Gu. 22 Apr 2025.
(How) Do Language Models Track State?
Belinda Z. Li, Zifan Carl Guo, Jacob Andreas. 04 Mar 2025.
Which Attention Heads Matter for In-Context Learning?
Kayo Yin, Jacob Steinhardt. 19 Feb 2025.
Associative Recurrent Memory Transformer
Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, Andrey Kravchenko. 17 Feb 2025.
An Empirical Study of Mamba-based Language Models
R. Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, V. Korthikanti, ..., Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro. 12 Jun 2024.
gzip Predicts Data-dependent Scaling Laws
Rohan Pandey. 26 May 2024.
The mechanistic basis of data dependence and abrupt learning in an in-context classification task
Gautam Reddy. 03 Dec 2023.
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu, Yuanzhi Li. 23 May 2023.
A Theory of Emergent In-Context Learning as Implicit Structure Induction
Michael Hahn, Navin Goyal. 14 Mar 2023.
Examining the Inductive Bias of Neural Language Models with Artificial Languages
Jennifer C. White, Ryan Cotterell. 02 Jun 2021.
Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin. 12 Jun 2017.