Causal Feature Learning in the Social Sciences

Abstract
Variable selection poses a significant challenge in causal modeling, particularly within the social sciences, where constructs often rely on interrelated factors such as age, socioeconomic status, gender, and race. Indeed, it has been argued that such attributes must be modeled as macro-level abstractions of lower-level manipulable features in order to preserve the modularity assumption essential to causal inference. This paper accordingly extends the theoretical framework of Causal Feature Learning (CFL). Empirically, we apply the CFL algorithm to diverse social science datasets, evaluating how CFL-derived macrostates compare with traditional microstates in downstream modeling tasks.
@article{huang2025_2503.12784,
  title={Causal Feature Learning in the Social Sciences},
  author={Jingzhou Huang and Jiuyao Lu and Alexander Williams Tolbert},
  journal={arXiv preprint arXiv:2503.12784},
  year={2025}
}