ResearchTrend.AI

Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

16 July 2024
Karolis Jucys
George Adamopoulos
Mehrab Hamidi
Stephanie Milani
Mohammad Reza Samsami
Artem Zholus
Sonia Joseph
Blake A. Richards
Irina Rish
Özgür Simsek

Papers citing "Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent"

13 / 13 papers shown
How to use and interpret activation patching
Stefan Heimersheim
Neel Nanda
73
47
0
23 Apr 2024
Explaining Reinforcement Learning with Shapley Values
Daniel Beechey
Thomas M. S. Smith
Özgür Simsek
TDI, FAtt
50
18
0
09 Jun 2023
Towards Automated Circuit Discovery for Mechanistic Interpretability
Arthur Conmy
Augustine N. Mavor-Parker
Aengus Lynch
Stefan Heimersheim
Adrià Garriga-Alonso
64
318
0
28 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
310
559
0
01 Nov 2022
On Feature Learning in the Presence of Spurious Correlations
Pavel Izmailov
Polina Kirichenko
Nate Gruver
A. Wilson
103
129
0
20 Oct 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
319
525
0
24 Sep 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
130
303
0
23 Jun 2022
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Alexander D'Amour
Katherine A. Heller
D. Moldovan
Ben Adlam
B. Alipanahi
...
Kellie Webster
Steve Yadlowsky
T. Yun
Xiaohua Zhai
D. Sculley
OffRL
120
688
0
06 Nov 2020
Explainable Reinforcement Learning: A Survey
Erika Puiutta
Eric M. S. P. Veith
XAI
71
248
0
13 May 2020
MineRL: A Large-Scale Dataset of Minecraft Demonstrations
William H. Guss
Brandon Houghton
Nicholay Topin
Phillip Wang
Cayden R. Codel
Manuela Veloso
Ruslan Salakhutdinov
OffRL
68
227
0
29 Jul 2019
Sanity Checks for Saliency Maps
Julius Adebayo
Justin Gilmer
M. Muelly
Ian Goodfellow
Moritz Hardt
Been Kim
FAtt, AAML, XAI
141
1,969
0
08 Oct 2018
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Marco Tulio Ribeiro
Sameer Singh
Carlos Guestrin
FAtt, FaML
1.2K
17,027
0
16 Feb 2016
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
Karen Simonyan
Andrea Vedaldi
Andrew Zisserman
FAtt
314
7,316
0
20 Dec 2013