Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components

14 November 2014

Philipp Geiger

Abstract

A widely applied approach to causal inference from a time series $X$, often referred to as "(linear) Granger causal analysis", is to simply regress the present on the past and interpret the regression matrix $\hat{B}$ causally. However, if there is an unmeasured time series $Z$ that influences $X$, then this approach can lead to wrong causal conclusions, i.e., conclusions distinct from those one would draw if one had additional information such as $Z$. In this paper we take a different approach: we assume that $X$ together with some unmeasured $Z$ forms a vector autoregressive (VAR) process with transition matrix $A$, and argue why it is more valid to interpret $A$ causally instead of $\hat{B}$. We then examine under which conditions the most important parts of $A$ are identifiable or almost identifiable from $X$ alone. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from $X$ to $Z$. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We also discuss how to check certain model assumptions using $X$.
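
The failure mode described above can be illustrated with a small simulation. The following is a minimal sketch, not one of the paper's two estimation algorithms: it simulates a VAR process over $(X, Z)$ and compares the regression matrix $\hat{B}$ obtained from $X$ alone with the $X$-to-$X$ block of $A$. The transition matrix, the uniform noise, the sample size, and the helper names `simulate_var` and `regress_present_on_past` are all assumptions chosen for illustration. The example satisfies both condition (1), non-Gaussian independent noise, and condition (2), no influence from $X$ to $Z$.

```python
import numpy as np


def simulate_var(A, n_steps, rng):
    """Simulate W_t = A W_{t-1} + N_t with independent uniform (non-Gaussian) noise."""
    dim = A.shape[0]
    W = np.zeros((n_steps, dim))
    for t in range(1, n_steps):
        W[t] = A @ W[t - 1] + rng.uniform(-1.0, 1.0, size=dim)
    return W


def regress_present_on_past(X):
    """Least-squares estimate of the matrix M in X_t ~ M X_{t-1}."""
    past, present = X[:-1], X[1:]
    # lstsq solves past @ M.T = present in the least-squares sense.
    M_transposed, *_ = np.linalg.lstsq(past, present, rcond=None)
    return M_transposed.T


rng = np.random.default_rng(0)

# Hypothetical transition matrix over (X1, X2, Z), chosen for illustration:
# X1 and X2 do not influence each other, the hidden Z drives both,
# and X does not influence Z.
A = np.array([
    [0.5, 0.0, 0.8],
    [0.0, 0.5, 0.8],
    [0.0, 0.0, 0.9],
])

W = simulate_var(A, n_steps=20000, rng=rng)
X = W[:, :2]  # only X is treated as observed

B_true = A[:2, :2]                   # X-to-X block of A
B_hat = regress_present_on_past(X)   # Granger-style regression on X alone
A_hat = regress_present_on_past(W)   # regression with Z observed, for contrast

print("X-to-X block of A:\n", B_true)
print("B_hat from X only:\n", np.round(B_hat, 2))
print("X-to-X block estimated with Z observed:\n", np.round(A_hat[:2, :2], 2))
```

In this setup the off-diagonal entries of `B_hat` come out clearly nonzero even though the corresponding entries of $A$ are exactly zero, so a causal reading of $\hat{B}$ would suggest an influence between the two observed components that the model does not contain. Regressing on the full process, which is only possible when $Z$ is measured, recovers the $X$-to-$X$ block up to sampling error.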

