Efficiently Learning and Sampling Interventional Distributions from Observations
ICML

We study the problem of efficiently estimating the effect of an intervention on a single variable using observational samples in a causal Bayesian network. Our goal is to give algorithms that are efficient in both time and sample complexity in a non-parametric setting. Tian and Pearl (AAAI '02) exactly characterized the class of causal graphs for which causal effects of atomic interventions can be identified from observational data. We make their result quantitative. Suppose P is a causal model on a set V of n observable variables with respect to a given causal graph G, with observable distribution P. Let P_x denote the interventional distribution over the observables with respect to an intervention that sets a designated variable X to x. Assuming that G has bounded in-degree and bounded c-components, and that the observational distribution is identifiable and satisfies a certain strong positivity condition, we show:

1. [Evaluation] There is an algorithm that, with probability at least 2/3, outputs an evaluator for a distribution P̂ satisfying d_TV(P_x, P̂) ≤ ε, using m = Õ(n/ε²) samples from P and O(mn) time. The evaluator can return in O(n) time the probability P̂(v) for any assignment v to V.

2. [Generation] There is an algorithm that, with probability at least 2/3, outputs a sampler for a distribution P̂ satisfying d_TV(P_x, P̂) ≤ ε, using m = Õ(n/ε²) samples from P and O(mn) time. The sampler returns an i.i.d. sample from P̂ with probability at least 1 − δ in O(n log(1/δ)) time.

We extend our techniques to estimate marginals of P_x over a given subset Y ⊆ V of interest. We also show lower bounds demonstrating that our sample complexity has optimal dependence on the parameters n and ε, as well as on the strong positivity parameter.
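The paper's algorithms handle general identifiable graphs with bounded c-components. As a minimal illustration of the underlying task — estimating an interventional distribution P_x from observational samples alone — the sketch below uses backdoor adjustment on a hypothetical three-variable network Z → X → Y with confounder Z → Y. All variable names and parameters here are illustrative assumptions, not taken from the paper.

```python
import random

random.seed(0)

# Hypothetical ground-truth causal model (all variables binary):
#   Z -> X, Z -> Y, X -> Y, so Z confounds the effect of X on Y.
p_z = 0.4                        # P(Z = 1)
p_x_given_z = {0: 0.3, 1: 0.8}   # P(X = 1 | Z = z)
p_y_given_xz = {(0, 0): 0.2, (0, 1): 0.5,
                (1, 0): 0.6, (1, 1): 0.9}  # P(Y = 1 | X = x, Z = z)

def draw():
    """Draw one observational sample (z, x, y) from the model."""
    z = int(random.random() < p_z)
    x = int(random.random() < p_x_given_z[z])
    y = int(random.random() < p_y_given_xz[(x, z)])
    return z, x, y

samples = [draw() for _ in range(200_000)]

def estimate_do(x_val):
    """Estimate P(Y = 1 | do(X = x_val)) by backdoor adjustment over Z:
       P(Y | do(X = x)) = sum_z P(z) * P(Y | X = x, Z = z)."""
    n = len(samples)
    total = 0.0
    for z_val in (0, 1):
        n_z = sum(1 for z, _, _ in samples if z == z_val)
        n_xz = sum(1 for z, x, _ in samples if z == z_val and x == x_val)
        n_yxz = sum(1 for z, x, y in samples
                    if z == z_val and x == x_val and y == 1)
        total += (n_z / n) * (n_yxz / n_xz)  # P(z) * P(Y=1 | x, z)
    return total

# Exact interventional value under the model: 0.6*0.6 + 0.4*0.9 = 0.72.
truth = sum((p_z if z else 1 - p_z) * p_y_given_xz[(1, z)] for z in (0, 1))

# Naive conditional P(Y = 1 | X = 1), biased upward by the confounder Z.
naive = (sum(1 for _, x, y in samples if x == 1 and y == 1)
         / sum(1 for _, x, _ in samples if x == 1))

print(f"true P(Y=1 | do(X=1)) = {truth:.3f}")
print(f"adjusted estimate     = {estimate_do(1):.3f}")
print(f"naive P(Y=1 | X=1)    = {naive:.3f}")
```

The naive conditional overestimates the causal effect because Z raises both X and Y; the adjusted estimator recovers the interventional value. The paper's contribution is doing this kind of estimation with near-optimal sample and time complexity for the full Tian–Pearl identifiable class, not just backdoor graphs.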