Feature Selection for Regression Problems Based on the Morisita Estimator of Intrinsic Dimension: Concept and Case Studies

Abstract

Data acquisition, storage and management have improved, yet the key factors of many phenomena remain poorly understood. Consequently, irrelevant and redundant features artificially increase the size of datasets, which complicates learning tasks such as regression. Feature selection methods have been proposed to address this problem. This research introduces a new supervised filter based on the Morisita estimator of intrinsic dimension. It is able to identify relevant features and to distinguish between redundant and irrelevant information. Moreover, it does not rely on arbitrary parameters and can be easily implemented in any programming environment. The suggested algorithm is applied to both synthetic and real data, and it is compared with RReliefF using extreme learning machines.
