Learning Small Decision Trees with Few Outliers: A Parameterized Perspective

Abstract

Decision trees are a fundamental tool in machine learning for representing, classifying, and generalizing data. It is desirable to construct ``small'' decision trees, by minimizing either the \textit{size} ($s$) or the \textit{depth} ($d$) of the \textit{decision tree} (\textsc{DT}). Recently, the parameterized complexity of \textsc{Decision Tree Learning} has attracted a lot of attention. We consider a generalization of \textsc{Decision Tree Learning} where, given a \textit{classification instance} $E$ and an integer $t$, the task is to find a ``small'' \textsc{DT} that disagrees with $E$ in at most $t$ examples. We consider two problems: \textsc{DTSO} and \textsc{DTDO}, where the goal is to construct a \textsc{DT} minimizing $s$ and $d$, respectively. We first establish that both \textsc{DTSO} and \textsc{DTDO} are W[1]-hard when parameterized by $s+\delta_{max}$ and $d+\delta_{max}$, respectively, where $\delta_{max}$ is the maximum number of features in which two differently labeled examples can differ. We complement this result by showing that these problems become \textsc{FPT} if we include the parameter $t$. We also consider the kernelization complexity of these problems and establish several positive and negative results for both \textsc{DTSO} and \textsc{DTDO}.
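To make these notions concrete, the following is a minimal Python sketch, not taken from the paper: it builds a toy classification instance $E$ over binary features, counts how many examples a fixed \textsc{DT} misclassifies (the outlier budget $t$), and computes the parameter $\delta_{max}$. The toy instance and all function names are illustrative assumptions.

    from itertools import combinations

    # Toy classification instance E: four examples over three binary features,
    # each given as (feature_vector, label). Purely illustrative.
    E = [
        ((0, 0, 1), 0),
        ((0, 1, 1), 0),
        ((1, 0, 0), 1),
        ((1, 1, 0), 1),
    ]

    # A depth-1 decision tree that splits on feature 0; leaves carry labels.
    tree = {"feature": 0, "zero": {"leaf": 0}, "one": {"leaf": 1}}

    def classify(tree, x):
        """Follow the tree from the root to a leaf and return its label."""
        while "leaf" not in tree:
            tree = tree["one"] if x[tree["feature"]] == 1 else tree["zero"]
        return tree["leaf"]

    def outliers(tree, E):
        """Number of examples in E the tree misclassifies (the budget t)."""
        return sum(1 for x, y in E if classify(tree, x) != y)

    def delta_max(E):
        """Max number of features in which two differently labeled examples differ."""
        return max(
            sum(a != b for a, b in zip(x1, x2))
            for (x1, y1), (x2, y2) in combinations(E, 2)
            if y1 != y2
        )

    print(outliers(tree, E))  # 0: this depth-1 tree fits E with no outliers
    print(delta_max(E))       # 3: (0,0,1)/0 and (1,1,0)/1 differ in all features

In this toy instance a zero-outlier \textsc{DT} of depth 1 exists; the paper's problems ask, given a budget $t$, for a \textsc{DT} of minimum size or depth that leaves at most $t$ examples misclassified.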

@article{gahlawat2025_2505.15648,
  title={Learning Small Decision Trees with Few Outliers: A Parameterized Perspective},
  author={Harmender Gahlawat and Meirav Zehavi},
  journal={arXiv preprint arXiv:2505.15648},
  year={2025}
}