Conjecturing-Based Computational Discovery of Patterns in Data

Abstract
We propose the use of a conjecturing machine that generates feature relationships in the form of bounds involving nonlinear terms for numerical features and boolean expressions for categorical features. The proposed \textsc{Conjecturing} framework recovers known nonlinear and boolean relationships among features from data. In both settings, true underlying relationships are revealed. We then compare the method to a previously-proposed framework for symbolic regression and demonstrate that it can also be used to recover equations that are satisfied among features in a dataset. The framework is then applied to patient-level data regarding COVID-19 outcomes to suggest possible risk factors that are confirmed in medical literature.
View on arXivComments on this paper