
The Lambert Way to Gaussianize skewed, heavy tailed data with the inverse of Tukey's h transformation as a special case

Abstract

In this work I follow the same principle as in Goerg (2011) and introduce a parametric, bijective transformation to generate heavy-tail versions $Y$ of an arbitrary random variable (RV) $X \sim F_X$. The tail behavior of the heavy-tail Lambert W $\times$ $F_X$ RV $Y$ depends on a tail parameter $\delta \geq 0$: for $\delta = 0$, $Y = X$; for $\delta > 0$, $Y$ has heavier tails than $X$. For Gaussian $X$, this new meta-family of heavy-tailed distributions reduces to Tukey's $h$ distribution. The Lambert W framework yields an explicit inverse and thus analytical, concise and simple expressions for the cumulative distribution function (cdf) $G_Y(y)$ and probability density function (pdf) $g_Y(y)$, which are functions of $F_X(x)$, $f_X(x)$ and Lambert's W function. As a special case, Tukey's $h$ pdf and cdf become available, to the author's knowledge for the first time in the literature. Furthermore, the Lambert W approach allows researchers to "Gaussianize" skewed, heavy-tailed data and apply common methods and models to the resulting Gaussian data. The optimal parameters to Gaussianize can be estimated by maximum likelihood (ML). An illustration on a simulated Cauchy sample as well as on S&P 500 log-returns demonstrates the power of this new family of heavy-tailed distributions: in both cases the back-transformed data is indistinguishable from a Gaussian sample. The R package "LambertW" (cran.r-project.org/web/packages/LambertW) contains the methods presented here to perform an adequate empirical analysis and is publicly available from CRAN.
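The abstract does not spell out the transformation itself, so the following is only a minimal sketch. It assumes the standard form of Tukey's $h$ transformation on a standardized input, $z = u \exp(\tfrac{\delta}{2} u^2)$, whose inverse follows from $\delta z^2 = \delta u^2 \exp(\delta u^2)$ and is therefore $u = \operatorname{sgn}(z) \sqrt{W(\delta z^2)/\delta}$, with $W$ the principal branch of Lambert's W function. The function names `lambert_w0`, `heavy_tail` and `gaussianize` are hypothetical and are not the API of the "LambertW" R package; location, scale and ML estimation of $\delta$ are left out.

```python
import math


def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch of Lambert's W for x >= 0, i.e. the w with w * exp(w) = x.

    Plain Newton iteration; an illustrative helper, not a library routine.
    """
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starts at or above the root for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def heavy_tail(u, delta):
    """Assumed forward map on standardized input: u -> u * exp(delta / 2 * u^2)."""
    return u * math.exp(0.5 * delta * u * u)


def gaussianize(z, delta):
    """Inverse of heavy_tail: sgn(z) * sqrt(W(delta * z^2) / delta)."""
    if delta == 0.0:
        return z  # delta = 0 leaves the variable unchanged
    return math.copysign(math.sqrt(lambert_w0(delta * z * z) / delta), z)


if __name__ == "__main__":
    delta = 0.2  # arbitrary illustrative tail parameter
    for u in (-2.0, -0.5, 0.0, 1.0, 3.0):
        z = heavy_tail(u, delta)
        print(u, z, gaussianize(z, delta))
```

Each printed row shows an input, its heavy-tailed image, and the value recovered by the inverse, which should agree with the input up to numerical tolerance. Under the same assumption, a location-scale version would apply `heavy_tail` to $(x - \mu)/\sigma$ and map back with $\mu + \sigma z$.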
