Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models

21 March 2024

Papers citing "Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models"

Title | Authors | Tags | Counts | Date
Improving Reasoning Performance in Large Language Models via Representation Engineering | Bertram Højer, Oliver Jarvis, Stefan Heinrich | LRM | 83 1 0 | 28 Apr 2025
Revisiting the Othello World Model Hypothesis | Yifei Yuan, Anders Søgaard | LRM | 60 0 0 | 06 Mar 2025
Mixture of Experts Made Intrinsically Interpretable | Xingyi Yang, Constantin Venhoff, Ashkan Khakzar, Christian Schroeder de Witt, P. Dokania, Adel Bibi, Philip H. S. Torr | MoE | 49 0 0 | 05 Mar 2025
(How) Do Language Models Track State? | Belinda Z. Li, Zifan Carl Guo, Jacob Andreas | LRM | 46 0 0 | 04 Mar 2025
Implicit Search via Discrete Diffusion: A Study on Chess | Jiacheng Ye, Zhenyu Wu, Jiahui Gao, Zhiyong Wu, Xin Jiang, Z. Li, Lingpeng Kong | DiffM | 50 2 0 | 27 Feb 2025
Transformers Use Causal World Models in Maze-Solving Tasks | Alex F Spies, William Edwards, Michael I. Ivanitskiy, Adrians Skapars, Tilman Rauker, Katsumi Inoue, A. Russo, Murray Shanahan | - | 128 1 0 | 16 Dec 2024
COLD: Causal reasOning in cLosed Daily activities | Abhinav Joshi, A. Ahmad, Ashutosh Modi | LRM, ReLM | 74 1 0 | 29 Nov 2024
Human-aligned Chess with a Bit of Search | Yiming Zhang, Athul Paul Jacob, Vivian Lai, Daniel Fried, Daphne Ippolito | - | 26 1 0 | 04 Oct 2024
On Logical Extrapolation for Mazes with Recurrent and Implicit Networks | Brandon Knutson, Amandin Chyba Rabeendran, Michael I. Ivanitskiy, Jordan Pettyjohn, Cecilia G. Diniz Behn, Samy Wu Fung, Daniel McKenzie | LRM | 44 2 0 | 03 Oct 2024
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models | Adam Karvonen, Benjamin Wright, Can Rager, Rico Angell, Jannik Brinkmann, Logan Smith, C. M. Verdun, David Bau, Samuel Marks | - | 38 26 0 | 31 Jul 2024
Unlocking the Future: Exploring Look-Ahead Planning Mechanistic Interpretability in Large Language Models | Tianyi Men, Pengfei Cao, Zhuoran Jin, Yubo Chen, Kang Liu, Jun Zhao | LLMAG, AIFin | 25 4 0 | 23 Jun 2024
Transcendence: Generative Models Can Outperform The Experts That Train Them | Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham Kakade, Eran Malach | - | 27 10 0 | 17 Jun 2024
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network | Erik Jenner, Shreyas Kapur, Vasil Georgiev, Cameron Allen, Scott Emmons, Stuart J. Russell | - | 37 10 0 | 02 Jun 2024
Controlling Large Language Model Agents with Entropic Activation Steering | Nate Rahn, P. D'Oro, Marc G. Bellemare | LLMSV | 30 6 0 | 01 Jun 2024
SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by Simulation | Matthias Lindemann, Alexander Koller, Ivan Titov | AI4CE | 19 1 0 | 01 Oct 2023
Probing Classifiers: Promises, Shortcomings, and Advances | Yonatan Belinkov | - | 226 405 0 | 24 Feb 2021