Learning Small Decision Trees with Few Outliers: A Parameterized Perspective

Decision trees are a fundamental tool in machine learning for representing, classifying, and generalizing data. It is desirable to construct ``small'' decision trees, by minimizing either the \textit{size} ($s$) or the \textit{depth} ($d$) of the \textit{decision tree} (\textsc{DT}). Recently, the parameterized complexity of \textsc{Decision Tree Learning} has attracted a lot of attention. We consider a generalization of \textsc{Decision Tree Learning} where, given a \textit{classification instance} $E$ and an integer $t$, the task is to find a ``small'' \textsc{DT} that disagrees with $E$ in at most $t$ examples. We consider two problems: \textsc{DTSO} and \textsc{DTDO}, where the goal is to construct a \textsc{DT} minimizing $s$ and $d$, respectively. We first establish that both \textsc{DTSO} and \textsc{DTDO} are W[1]-hard when parameterized by $s+\delta_{\max}$ and $d+\delta_{\max}$, respectively, where $\delta_{\max}$ is the maximum number of features in which two differently labeled examples can differ. We complement this result by showing that these problems become \textsc{FPT} if we include the parameter $t$. We also consider the kernelization complexity of these problems and establish several positive and negative results for both \textsc{DTSO} and \textsc{DTDO}.
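To make the definitions concrete, here is a small illustrative Python sketch (not taken from the paper): it builds a toy binary classification instance $E$, computes $\delta_{\max}$ as defined above, and counts the examples on which a fixed \textsc{DT} disagrees with $E$ (the outliers). The dict-based tree representation and the helper names (delta_max, classify) are hypothetical, chosen only for illustration.

from itertools import combinations

# A toy classification instance E: each example is (feature vector, label),
# with binary features and binary labels.
E = [
    ((0, 0, 1), 0),
    ((0, 1, 1), 1),
    ((1, 0, 0), 0),
    ((1, 1, 0), 1),
]

def delta_max(examples):
    """delta_max: the maximum number of features in which two
    differently labeled examples differ (as defined in the abstract)."""
    best = 0
    for (x, yx), (z, yz) in combinations(examples, 2):
        if yx != yz:
            best = max(best, sum(a != b for a, b in zip(x, z)))
    return best

# A tiny DT, hard-coded as a nested dict: internal nodes test one
# feature, leaves predict a label. (Hypothetical representation.)
tree = {"feature": 1, "zero": {"label": 0}, "one": {"label": 1}}

def classify(node, x):
    while "label" not in node:
        node = node["one"] if x[node["feature"]] else node["zero"]
    return node["label"]

# A DT "disagrees with E in at most t examples" if the number of
# misclassified examples (the outliers) is at most t.
outliers = sum(classify(tree, x) != y for x, y in E)
print("delta_max =", delta_max(E))  # 3 for this toy instance
print("outliers  =", outliers)      # 0: this single-test DT fits E exactly

# DTSO asks for a DT of minimum size s with at most t outliers;
# DTDO asks for a DT of minimum depth d with at most t outliers.

In this instance the label equals feature 1, so a single-test tree attains zero outliers; with $t > 0$, DTSO and DTDO may trade a few misclassified examples for a smaller or shallower tree.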
@article{gahlawat2025_2505.15648,
  title={Learning Small Decision Trees with Few Outliers: A Parameterized Perspective},
  author={Harmender Gahlawat and Meirav Zehavi},
  journal={arXiv preprint arXiv:2505.15648},
  year={2025}
}