Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs

14 May 2025

Hamidreza Saghir

Papers citing "Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs"

Title
Finding Transformer Circuits with Edge Pruning | Adithya Bhaskar, Alexander Wettig, Dan Friedman, Danqi Chen | 208 20 0 | 24 Jun 2024
The Power of Noise: Redefining Retrieval for RAG Systems | Florin Cuconasu, Giovanni Trappolini, F. Siciliano, Simone Filice, Cesare Campagnano, Y. Maarek, Nicola Tonellotto, Fabrizio Silvestri | RALM | 120 180 0 | 26 Jan 2024
Linearity of Relation Decoding in Transformer Language Models | Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau | KELM | 82 100 0 | 17 Aug 2023
Tracr: Compiled Transformers as a Laboratory for Interpretability | David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Tom McGrath, Vladimir Mikulik | 117 75 0 | 12 Jan 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small | Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt | 318 563 0 | 01 Nov 2022
In-context Learning and Induction Heads | Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah | 323 528 0 | 24 Sep 2022
Locating and Editing Factual Associations in GPT | Kevin Meng, David Bau, A. Andonian, Yonatan Belinkov | KELM | 253 1,390 0 | 10 Feb 2022
Measuring and Improving Consistency in Pretrained Language Models | Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard H. Hovy, Hinrich Schütze, Yoav Goldberg | HILM | 331 370 0 | 01 Feb 2021
Language Models are Few-Shot Learners | Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei | BDL | 908 42,520 0 | 28 May 2020
Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation | Yoshua Bengio, Nicholas Léonard, Aaron Courville | 403 3,158 0 | 15 Aug 2013