Quantifying and estimating dependence via sensitivity of conditional distributions

Abstract

Recently established directed dependence measures for pairs $(X,Y)$ of random variables build upon the natural idea of comparing the conditional distributions of $Y$ given $X=x$ with the marginal distribution of $Y$. They assign pairs $(X,Y)$ values in $[0,1]$; the value is $0$ if and only if $X$ and $Y$ are independent, and it is $1$ exclusively when $Y$ is a function of $X$. Here we show that comparing randomly drawn conditional distributions with each other instead or, equivalently, analyzing how sensitive the conditional distribution of $Y$ given $X=x$ is to $x$, opens the door to constructing novel families of dependence measures $\Lambda_\varphi$ induced by general convex functions $\varphi: \mathbb{R} \rightarrow \mathbb{R}$, containing, e.g., Chatterjee's coefficient of correlation as a special case. After establishing additional useful properties of $\Lambda_\varphi$, we focus on continuous $(X,Y)$, translate $\Lambda_\varphi$ to the copula setting, consider the $L^p$-version, and establish an estimator which is strongly consistent in full generality. A real data example and a simulation study illustrate the chosen approach and the performance of the estimator. Complementing the aforementioned results, we show how a slight modification of the construction underlying $\Lambda_\varphi$ can be used to define new measures of explainability generalizing the fraction of explained variance.
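
To make the special case named in the abstract concrete, the following is a minimal sketch of Chatterjee's rank-based coefficient of correlation in its standard no-ties form, $\xi_n = 1 - 3\sum_{i=1}^{n-1} |r_{i+1} - r_i| / (n^2 - 1)$, where $r_i$ is the rank of the $Y$-value paired with the $i$-th smallest $X$-value. It is not the estimator of $\Lambda_\varphi$ proposed in the paper; the function name `chatterjee_xi` is chosen here for illustration, and the code assumes continuous data without ties.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n (illustrative sketch, assumes no ties)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Reorder the y-values so that the paired x-values are increasing.
    y_by_x = y[np.argsort(x, kind="stable")]
    # Rank (1..n) of each reordered y-value among all y-values.
    ranks = np.argsort(np.argsort(y_by_x)) + 1
    # Small rank jumps between x-neighbours indicate that y depends strongly on x.
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)
```

For independent samples the value is close to $0$, and for $Y = f(X)$ it approaches $1$ as $n$ grows, even when $f$ is not monotone. The coefficient is directed: swapping the arguments generally changes the value.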
