ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.14491
  4. Cited By
Towards a Mechanistic Interpretation of Multi-Step Reasoning
  Capabilities of Language Models

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

23 October 2023
Buse Giledereli
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
    LRM
ArXivPDFHTML

Papers citing "Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models"

34 / 34 papers shown
Title
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
Lefei Zhang
Lijie Hu
Di Wang
LRM
156
4
0
17 Feb 2025
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
Yaniv Nikankin
Anja Reusch
Aaron Mueller
Yonatan Belinkov
AIFin
LRM
89
32
0
28 Oct 2024
Identifying Sub-networks in Neural Networks via Functionally Similar Representations
Identifying Sub-networks in Neural Networks via Functionally Similar Representations
Tian Gao
Amit Dhurandhar
Karthikeyan N. Ramamurthy
Dennis L. Wei
70
0
0
21 Oct 2024
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models
Jiachun Li
Pengfei Cao
Zhuoran Jin
Yubo Chen
Kang Liu
Jun Zhao
LRM
ELM
62
6
0
12 Oct 2024
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of
  Language Models with Hypothesis Refinement
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
Linlu Qiu
Liwei Jiang
Ximing Lu
Melanie Sclar
Valentina Pyatkin
...
Bailin Wang
Yoon Kim
Yejin Choi
Nouha Dziri
Xiang Ren
LRM
ReLM
73
85
0
12 Oct 2023
Grokking of Hierarchical Structure in Vanilla Transformers
Grokking of Hierarchical Structure in Vanilla Transformers
Shikhar Murty
Pratyusha Sharma
Jacob Andreas
Christopher D. Manning
73
47
0
30 May 2023
Language Models Implement Simple Word2Vec-style Vector Arithmetic
Language Models Implement Simple Word2Vec-style Vector Arithmetic
Jack Merullo
Carsten Eickhoff
Ellie Pavlick
KELM
59
65
0
25 May 2023
Large Language Models are In-Context Semantic Reasoners rather than
  Symbolic Reasoners
Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners
Xiaojuan Tang
Zilong Zheng
Jiaqi Li
Fanxu Meng
Song-Chun Zhu
Yitao Liang
Muhan Zhang
ReLM
LRM
70
61
0
24 May 2023
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca
Zhengxuan Wu
Atticus Geiger
Thomas Icard
Christopher Potts
Noah D. Goodman
MILM
70
92
0
15 May 2023
RECKONING: Reasoning through Dynamic Knowledge Encoding
RECKONING: Reasoning through Dynamic Knowledge Encoding
Zeming Chen
Gail Weiss
E. Mitchell
Asli Celikyilmaz
Antoine Bosselut
KELM
LRM
61
13
0
10 May 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language
  Models
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
240
313
0
28 Apr 2023
LLaMA: Open and Efficient Foundation Language Models
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,167
0
27 Feb 2023
Progress measures for grokking via mechanistic interpretability
Progress measures for grokking via mechanistic interpretability
Neel Nanda
Lawrence Chan
Tom Lieberum
Jess Smith
Jacob Steinhardt
73
435
0
12 Jan 2023
Large Language Models are Zero-Shot Reasoners
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
504
4,409
0
24 May 2022
Selection-Inference: Exploiting Large Language Models for Interpretable
  Logical Reasoning
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
Antonia Creswell
Murray Shanahan
I. Higgins
ReLM
LRM
92
361
0
19 May 2022
Do Transformer Models Show Similar Attention Patterns to Task-Specific
  Human Gaze?
Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?
Stephanie Brandl
Oliver Eberle
Jonas Pilot
Anders Søgaard
92
36
0
25 Apr 2022
Probing for the Usage of Grammatical Number
Probing for the Usage of Grammatical Number
Karim Lasri
Tiago Pimentel
Alessandro Lenci
Thierry Poibeau
Ryan Cotterell
54
58
0
19 Apr 2022
Rethinking Attention-Model Explainability through Faithfulness Violation
  Test
Rethinking Attention-Model Explainability through Faithfulness Violation Test
Yebin Liu
Haoliang Li
Yangyang Guo
Chen Kong
Jing Li
Shiqi Wang
FAtt
140
43
0
28 Jan 2022
Attention Flows are Shapley Value Explanations
Attention Flows are Shapley Value Explanations
Kawin Ethayarajh
Dan Jurafsky
FAtt
TDI
56
35
0
31 May 2021
Bird's Eye: Probing for Linguistic Graph Structures with a Simple
  Information-Theoretic Approach
Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach
Buse Giledereli
Mrinmaya Sachan
41
10
0
06 May 2021
Generic Attention-model Explainability for Interpreting Bi-Modal and
  Encoder-Decoder Transformers
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
Hila Chefer
Shir Gur
Lior Wolf
ViT
60
318
0
29 Mar 2021
Probing Classifiers: Promises, Shortcomings, and Advances
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
266
443
0
24 Feb 2021
On-the-Fly Attention Modulation for Neural Generation
On-the-Fly Attention Modulation for Neural Generation
Yue Dong
Chandra Bhagavatula
Ximing Lu
Jena D. Hwang
Antoine Bosselut
Jackie C.K. Cheung
Yejin Choi
88
13
0
02 Jan 2021
ProofWriter: Generating Implications, Proofs, and Abductive Statements
  over Natural Language
ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language
Oyvind Tafjord
Bhavana Dalvi
Peter Clark
70
273
0
24 Dec 2020
Attention Flows: Analyzing and Comparing Attention Mechanisms in
  Language Models
Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
Joseph F DeRose
Jiayao Wang
M. Berger
35
84
0
03 Sep 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
743
41,894
0
28 May 2020
Quantifying Attention Flow in Transformers
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
149
795
0
02 May 2020
Probing the Probing Paradigm: Does Probing Accuracy Entail Task
  Relevance?
Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?
Abhilasha Ravichander
Yonatan Belinkov
Eduard H. Hovy
61
128
0
02 May 2020
A Primer in BERTology: What we know about how BERT works
A Primer in BERTology: What we know about how BERT works
Anna Rogers
Olga Kovaleva
Anna Rumshisky
OffRL
84
1,496
0
27 Feb 2020
Designing and Interpreting Probes with Control Tasks
Designing and Interpreting Probes with Control Tasks
John Hewitt
Percy Liang
58
536
0
08 Sep 2019
On Identifiability in Transformers
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
63
188
0
12 Aug 2019
Analyzing the Structure of Attention in a Transformer Language Model
Analyzing the Structure of Attention in a Transformer Language Model
Jesse Vig
Yonatan Belinkov
66
367
0
07 Jun 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy
  Lifting, the Rest Can Be Pruned
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
106
1,139
0
23 May 2019
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning
  Challenge
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
158
2,583
0
14 Mar 2018
1