Learning Small Decision Trees with Few Outliers: A Parameterized Perspective

Abstract

Decision trees are a fundamental tool in machine learning for representing, classifying, and generalizing data. It is desirable to construct ``small'' decision trees, by minimizing either the \textit{size} ($s$) or the \textit{depth} ($d$) of the \textit{decision tree} (\textsc{DT}). Recently, the parameterized complexity of \textsc{Decision Tree Learning} has attracted a lot of attention. We consider a generalization of \textsc{Decision Tree Learning} where, given a \textit{classification instance} $E$ and an integer $t$, the task is to find a ``small'' \textsc{DT} that disagrees with $E$ in at most $t$ examples. We consider two problems: \textsc{DTSO} and \textsc{DTDO}, where the goal is to construct a \textsc{DT} minimizing $s$ and $d$, respectively. We first establish that both \textsc{DTSO} and \textsc{DTDO} are W[1]-hard when parameterized by $s+\delta_{max}$ and $d+\delta_{max}$, respectively, where $\delta_{max}$ is the maximum number of features in which two differently labeled examples can differ. We complement this result by showing that these problems become \textsc{FPT} if we include the parameter $t$. We also consider the kernelization complexity of these problems and establish several positive and negative results for both \textsc{DTSO} and \textsc{DTDO}.
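To make these notions concrete, the following is a minimal Python sketch, not taken from the paper: it builds a toy classification instance $E$ over binary features, counts how many examples a fixed \textsc{DT} misclassifies (the outlier budget $t$), and computes the parameter $\delta_{max}$. The toy instance and all function names are illustrative assumptions.

    from itertools import combinations

    # Toy classification instance E: four examples over three binary features,
    # each given as (feature_vector, label). Purely illustrative.
    E = [
        ((0, 0, 1), 0),
        ((0, 1, 1), 0),
        ((1, 0, 0), 1),
        ((1, 1, 0), 1),
    ]

    # A depth-1 decision tree that splits on feature 0; leaves carry labels.
    tree = {"feature": 0, "zero": {"leaf": 0}, "one": {"leaf": 1}}

    def classify(tree, x):
        """Follow the tree from the root to a leaf and return its label."""
        while "leaf" not in tree:
            tree = tree["one"] if x[tree["feature"]] == 1 else tree["zero"]
        return tree["leaf"]

    def outliers(tree, E):
        """Number of examples in E the tree misclassifies (the budget t)."""
        return sum(1 for x, y in E if classify(tree, x) != y)

    def delta_max(E):
        """Max number of features in which two differently labeled examples differ."""
        return max(
            sum(a != b for a, b in zip(x1, x2))
            for (x1, y1), (x2, y2) in combinations(E, 2)
            if y1 != y2
        )

    print(outliers(tree, E))  # 0: this depth-1 tree fits E with no outliers
    print(delta_max(E))       # 3: (0,0,1)/0 and (1,1,0)/1 differ in all features

In this toy instance a zero-outlier \textsc{DT} of depth 1 exists; the paper's problems ask, given a budget $t$, for a \textsc{DT} of minimum size or depth that leaves at most $t$ examples misclassified.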

@article{gahlawat2025_2505.15648,
  title={Learning Small Decision Trees with Few Outliers: A Parameterized Perspective},
  author={Harmender Gahlawat and Meirav Zehavi},
  journal={arXiv preprint arXiv:2505.15648},
  year={2025}
}