Explainability Via Causal Self-Talk

17 November 2022

Nicholas A. Roy

Neil C. Rabinowitz

Papers citing "Explainability Via Causal Self-Talk"

Title | Authors | Date
HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning | Quentin Delfosse, Jannis Blüml, Bjarne Gregori, Kristian Kersting | 06 Jun 2024
Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning | Hector Kohler, Quentin Delfosse, R. Akrour, Kristian Kersting, Philippe Preux | 23 May 2024
Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop | Hector Kohler, Quentin Delfosse, Paul Festor, Philippe Preux | 16 Apr 2024
Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents | Quentin Delfosse, Sebastian Sztwiertnia, M. Rothermel, Wolfgang Stammer, Kristian Kersting | 11 Jan 2024
Learning by Self-Explaining | Wolfgang Stammer, Felix Friedrich, David Steinmann, Manuel Brack, Hikaru Shindo, Kristian Kersting | 15 Sep 2023
In-context Learning and Induction Heads | Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah | 24 Sep 2022
PonderNet: Learning to Ponder | Andrea Banino, Jan Balaguer, Charles Blundell | 12 Jul 2021
Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI | Alon Jacovi, Ana Marasović, Tim Miller, Yoav Goldberg | 15 Oct 2020