Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models

10 June 2025

Abstract

As large language models (LLMs) continue to advance, there is a growing urgency to enhance the interpretability of their internal knowledge mechanisms. Consequently, many interpretation methods have emerged, aiming to unravel the knowledge mechanisms of LLMs from various perspectives. However, current interpretation methods differ in input data formats and interpreting outputs. The tools integrating these methods are only capable of supporting tasks with specific inputs, significantly constraining their practical applications. To address these challenges, we present an open-source Knowledge Mechanisms Revealer&Interpreter (Know-MRI) designed to analyze the knowledge mechanisms within LLMs systematically. Specifically, we have developed an extensible core module that can automatically match different input data with interpretation methods and consolidate the interpreting outputs. It enables users to freely choose appropriate interpretation methods based on the inputs, making it easier to comprehensively diagnose the model's internal knowledge mechanisms from multiple perspectives. Our code is available at this https URL. We also provide a demonstration video on this https URL.
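The matching-and-consolidation idea in the abstract can be illustrated with a minimal sketch. This is not Know-MRI's actual API: the `Method` class, the `match_methods` and `interpret` functions, the field names `prompt` and `ground_truth`, and the two toy methods are all hypothetical, chosen only to show how a registry might pair an input with the methods that can consume it and gather their outputs in one report.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Method:
    """Hypothetical descriptor for one interpretation method."""
    name: str
    required_fields: FrozenSet[str]        # input keys the method needs
    run: Callable[[Dict[str, Any]], Any]   # returns the method's raw output


def match_methods(sample: Dict[str, Any], registry: List[Method]) -> List[Method]:
    """Keep the methods whose required fields are all present in the sample."""
    return [m for m in registry if m.required_fields.issubset(sample)]


def interpret(sample: Dict[str, Any], registry: List[Method]) -> Dict[str, Any]:
    """Run every applicable method and collect outputs under the method name."""
    return {m.name: m.run(sample) for m in match_methods(sample, registry)}


# Toy stand-ins for real interpretation techniques (assumptions, not real methods).
REGISTRY = [
    Method("token_count", frozenset({"prompt"}),
           lambda s: len(s["prompt"].split())),
    Method("answer_in_prompt", frozenset({"prompt", "ground_truth"}),
           lambda s: s["ground_truth"] in s["prompt"]),
]

# Only "token_count" applies here, because the sample has no "ground_truth".
report = interpret({"prompt": "The capital of France is"}, REGISTRY)
print(report)
```

Declaring required fields per method keeps the dispatcher independent of any single dataset format: adding a method means adding one registry entry, and an input with extra fields simply unlocks more methods.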


@article{liu2025_2506.08427,
  title={Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models},
  author={Jiaxiang Liu and Boxuan Xing and Chenhao Yuan and Chenxiang Zhang and Di Wu and Xiusheng Huang and Haida Yu and Chuhan Lang and Pengfei Cao and Jun Zhao and Kang Liu},
  journal={arXiv preprint arXiv:2506.08427},
  year={2025}
}

Main: 6 Pages

10 Figures

Bibliography: 4 Pages

4 Tables

Appendix: 2 Pages
