Interpreto: An Explainability Library for Transformers

Antonin Poché
Thomas Mullor
Gabriele Sarti
Frédéric Boisnard
Corentin Friedrich
Charlotte Claye
François Hoofd
Raphael Bernas
Céline Hudelot
Fanny Jourdan
Main: 5 Pages
5 Figures
Bibliography: 4 Pages
1 Table
Abstract

Interpreto is a Python library for post-hoc explainability of HuggingFace text models, from early BERT variants to LLMs. It provides two complementary families of methods: attributions and concept-based explanations. The library connects recent research to practical tooling for data scientists, aiming to make explanations accessible to end users. It includes documentation, examples, and tutorials. Interpreto supports both classification and generation models through a unified API. A key differentiator is its concept-based functionality, which goes beyond feature-level attributions and is uncommon in existing libraries. The library is open source; install via pip install interpreto. Code and documentation are available at this https URL.