Estimation and Inference about Conditional Average Treatment Effect and Other Structural Functions

Abstract

Our framework can be viewed as inference on low-dimensional nonparametric functions in the presence of a high-dimensional nuisance function (where dimensionality refers to the number of covariates). Specifically, we consider the setting where we have a signal $Y=Y(\eta_0)$ that is an unbiased predictor of causal/structural objects like treatment effect, structural derivative, outcome given treatment, and others, conditional on a set of very high-dimensional controls $Z$. We are interested in simpler lower-dimensional nonparametric summaries of $Y$, namely $g(x)=E[Y|X=x]$, conditional on a low-dimensional subset of covariates $X$. The signal $Y=Y(\eta)$ depends on an unknown nuisance function $\eta_0(Z)$. In the first stage, we need to learn the function $\eta_0(Z)$ using any machine learning method that is able to approximate $\eta$ accurately under very high dimensionality of $Z$. For example, under approximate sparsity with respect to a dictionary, $\ell_1$-penalized methods can be used; in other settings, tools such as deep neural networks can be used. To make the subsequent inference valid, we make the signal orthogonal to perturbations of $\eta$. As a result, the second-stage low-dimensional nonparametric inference enjoys quasi-oracle properties, as if we knew $\eta_0$. In the second stage, we approximate the target function $g(x)$ by a linear form $p(x)'\beta_0$, where $\beta_0$ is the Best Linear Predictor parameter. We develop a complete set of results about estimation and approximately Gaussian inference on $x \mapsto p(x)'\beta$ and $x \mapsto g(x)$. If $p(x)$ is sufficiently rich and $g(x)$ admits a good approximation, then $g(x)$ is automatically targeted by the inference; otherwise, the best linear approximation $p(x)'\beta$ to $g(x)$ is targeted. When $p(x)$ is specified as a collection of group indicators, $p(x)'\beta$ describes group-average treatment effects (GATEs).
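The two-stage procedure described in the abstract can be illustrated with a minimal sketch for the GATE case. Everything specific here is an assumption made for illustration and is not taken from the paper: the orthogonal signal is chosen to be the doubly robust (AIPW) score for a binary treatment, the first-stage learners are plain least squares and a clipped linear probability model standing in for any machine learning method, the data are simulated, and the function names `ols_fit`, `orthogonal_signal`, and `best_linear_predictor` are hypothetical. The standard errors are ordinary heteroskedasticity-robust sandwich errors, shown only as a simple stand-in for the paper's inference results.

```python
import numpy as np


def ols_fit(A, b):
    """Least-squares coefficients for b ~ A."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef


def orthogonal_signal(y, d, Z, n_folds=2, seed=0):
    """Cross-fitted AIPW signal (an assumed choice of orthogonal signal)."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds  # balanced random fold labels
    A = np.column_stack([np.ones(n), Z])  # controls plus intercept
    signal = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Stage 1: estimate nuisance functions on the training folds only.
        # Placeholder learners; any ML method could be substituted here.
        b1 = ols_fit(A[train & (d == 1)], y[train & (d == 1)])
        b0 = ols_fit(A[train & (d == 0)], y[train & (d == 0)])
        bp = ols_fit(A[train], d[train].astype(float))
        mu1, mu0 = A[test] @ b1, A[test] @ b0
        e = np.clip(A[test] @ bp, 0.05, 0.95)  # propensity, kept off 0 and 1
        # Doubly robust score evaluated on the held-out fold.
        signal[test] = (
            mu1 - mu0
            + d[test] * (y[test] - mu1) / e
            - (1 - d[test]) * (y[test] - mu0) / (1 - e)
        )
    return signal


def best_linear_predictor(signal, P):
    """Stage 2: regress the signal on the basis p(x); robust standard errors."""
    n = len(signal)
    beta = ols_fit(P, signal)
    resid = signal - P @ beta
    q_inv = np.linalg.inv(P.T @ P / n)
    scores = P * resid[:, None]
    cov = q_inv @ (scores.T @ scores / n) @ q_inv / n
    return beta, np.sqrt(np.diag(cov))


# Simulated data (assumed design): two groups with true effects 1 and 2.
rng = np.random.default_rng(0)
n, dim_z = 4000, 5
Z = rng.normal(size=(n, dim_z))
group = (Z[:, 0] > 0).astype(int)
d = rng.binomial(1, 0.5, size=n)
y = Z @ np.ones(dim_z) + d * (1.0 + group) + rng.normal(size=n)

signal = orthogonal_signal(y, d, Z)
# p(x) as group indicators, so each coefficient is a group-average effect.
P = np.column_stack([group == 0, group == 1]).astype(float)
beta, se = best_linear_predictor(signal, P)
print("GATE estimates:", beta)
print("standard errors:", se)
```

With group indicators as the basis, the second-stage regression reduces to averaging the signal within each group, which is why the coefficients can be read as GATEs. A richer basis (for example, polynomials or splines in $X$) would plug into `best_linear_predictor` in the same way.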
