Supersparse Linear Integer Models for Interpretable Classification

27 June 2013

Abstract

Scoring systems are classification models that require users only to add, subtract and multiply a few meaningful numbers to generate a prediction. These systems are often used because they are practical and interpretable. In this paper, we introduce Supersparse Linear Integer Models (SLIM) as an off-the-shelf tool to create scoring systems that are both highly accurate and highly interpretable. SLIM is formulated as a discrete optimization problem that minimizes the 0-1 loss to encourage a high level of accuracy, penalizes the L0-norm of the coefficients to encourage a high level of sparsity, and uses additional constraints to further restrict coefficients to meaningful and intuitive values. We illustrate the practical and interpretable nature of SLIM scoring systems by presenting applications in medicine and criminology. In addition, we show through a series of numerical experiments that SLIM scoring systems are accurate and sparse in comparison to state-of-the-art classification models.
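
To make the kind of objective described in the abstract concrete, the following is a minimal sketch in Python. It is not the authors' method: the abstract describes a discrete optimization problem, whereas this sketch simply enumerates every small integer coefficient vector on a toy data set. The function names, the penalty weight `c0`, the coefficient bound `bound`, and the data are all assumptions made for illustration.

```python
from itertools import product

def predict(coefs, intercept, x):
    """Scoring-system prediction: add up integer points, then threshold at zero."""
    score = intercept + sum(c * v for c, v in zip(coefs, x))
    return 1 if score > 0 else -1

def objective(coefs, intercept, X, y, c0):
    """Average 0-1 loss plus c0 times the number of nonzero coefficients (L0 penalty).

    The intercept is left unpenalized here; that is an assumption of this sketch.
    """
    errors = sum(predict(coefs, intercept, x) != label for x, label in zip(X, y))
    l0 = sum(1 for c in coefs if c != 0)
    return errors / len(X) + c0 * l0

def fit_brute_force(X, y, c0=0.01, bound=2):
    """Exhaustive search over integers in [-bound, bound]; only feasible for tiny problems."""
    n_features = len(X[0])
    values = range(-bound, bound + 1)
    best = None
    for cand in product(values, repeat=n_features + 1):
        intercept, coefs = cand[0], cand[1:]
        val = objective(coefs, intercept, X, y, c0)
        if best is None or val < best[0]:
            best = (val, coefs, intercept)
    return best

# Hypothetical toy data: three binary features, label determined by the first one.
X = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0), (1, 1, 1), (0, 1, 0)]
y = [1, 1, -1, -1, 1, -1]

val, coefs, intercept = fit_brute_force(X, y)
print("objective:", val, "coefficients:", coefs, "intercept:", intercept)
```

On this toy data, only the first feature separates the classes by itself, so the search settles on a model with a single nonzero coefficient. Enumeration grows exponentially with the number of features, which is why a real solver would be needed beyond toy sizes.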
