Learning Transformer Programs
Dan Friedman, Alexander Wettig, Danqi Chen
arXiv: 2306.01128 · 1 June 2023
Papers citing "Learning Transformer Programs" (26 of 26 papers shown)
- Understanding the Logic of Direct Preference Alignment through Logic. Kyle Richardson, Vivek Srikumar, Ashish Sabharwal. 23 Dec 2024.
- Quantifying artificial intelligence through algebraic generalization. Takuya Ito, Murray Campbell, L. Horesh, Tim Klinger, Parikshit Ram. 08 Nov 2024.
- Hypothesis Testing the Circuit Hypothesis in LLMs. Claudia Shi, Nicolas Beltran-Velez, Achille Nazaret, Carolina Zheng, Adrià Garriga-Alonso, Andrew Jesson, Maggie Makar, David M. Blei. 16 Oct 2024.
- A mechanistically interpretable neural network for regulatory genomics. Alex Tseng, Gökçen Eraslan, Tommaso Biancalani, Gabriele Scalia. 08 Oct 2024.
- Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics. Wenqing Zhang, Junming Huang, Ruotong Wang, Changsong Wei, Wenqian Huang, Yuxin Qiao. 13 Sep 2024.
- How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression. Xingwu Chen, Lei Zhao, Difan Zou. 08 Aug 2024.
- Representing Rule-based Chatbots with Transformers. Dan Friedman, Abhishek Panigrahi, Danqi Chen. 15 Jul 2024.
- Algorithmic Language Models with Neurally Compiled Libraries. Lucas Saldyt, Subbarao Kambhampati. 06 Jul 2024.
- Finding Transformer Circuits with Edge Pruning. Adithya Bhaskar, Alexander Wettig, Dan Friedman, Danqi Chen. 24 Jun 2024.
- A Philosophical Introduction to Language Models - Part II: The Way Forward. Raphael Milliere, Cameron Buckner. 06 May 2024.
- Mechanistic Interpretability for AI Safety -- A Review. Leonard Bereska, E. Gavves. 22 Apr 2024.
- What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks. Xingwu Chen, Difan Zou. 02 Apr 2024.
- Discrete Neural Algorithmic Reasoning. Gleb Rodionov, Liudmila Prokhorenkova. 18 Feb 2024.
- Towards Uncovering How Large Language Model Works: An Explainability Perspective. Haiyan Zhao, Fan Yang, Bo Shen, Himabindu Lakkaraju, Mengnan Du. 16 Feb 2024.
- PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition. Jinghui Lu, Ziwei Yang, Yanjie Wang, Xuejing Liu, Brian Mac Namee, Can Huang. 07 Feb 2024.
- Simulation of Graph Algorithms with Looped Transformers. Artur Back de Luca, K. Fountoulakis. 02 Feb 2024.
- What Formal Languages Can Transformers Express? A Survey. Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin. 01 Nov 2023.
- Codebook Features: Sparse and Discrete Interpretability for Neural Networks. Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman. 26 Oct 2023.
- Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages. Andy Yang, David Chiang, Dana Angluin. 21 Oct 2023.
- Large Language Models. Michael R Douglas. 11 Jul 2023.
- Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations. Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, Noah D. Goodman. 05 Mar 2023.
- Tracr: Compiled Transformers as a Laboratory for Interpretability. David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Tom McGrath, Vladimir Mikulik. 12 Jan 2023.
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small. Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt. 01 Nov 2022.
- In-context Learning and Induction Heads. Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah. 24 Sep 2022.
- Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation. Ofir Press, Noah A. Smith, M. Lewis. 27 Aug 2021.
- Probing Classifiers: Promises, Shortcomings, and Advances. Yonatan Belinkov. 24 Feb 2021.