Closed-form cdf and pdf of Tukey's h-distribution: The Lambert Way to "Gaussianize" skewed, heavy-tailed data

Recently Goerg (2010) introduced Lambert W $\times$ F random variables (RVs), a new family of generalized skewed distributions. Here I adapt this framework to generate heavy-tailed versions of arbitrary distributions. As in the skewed case, a non-linear, parametric transformation of an input RV $X$ with arbitrary cumulative distribution function (cdf) $F_X(x)$ yields a heavy-tailed version $Y$. The tail behavior depends on a tail parameter $\delta \geq 0$: for $\delta = 0$, $Y \equiv X$; for $\delta > 0$, $Y$ has heavier tails than $X$. It turns out that heavy-tail Lambert W $\times$ Gaussian RVs equal heavy-tailed Tukey h RVs (the family with ). The Lambert W framework yields an explicit inverse of the transformation, and thus analytical, concise and simple expressions for the cdf and pdf of Tukey's h distribution - to the author's knowledge for the first time in the literature. Furthermore, the Lambert W approach gives applied researchers a tool to "Gaussianize" their skewed, heavy-tailed data and then apply common methods and models to the resulting Gaussian data. The optimal parameters for Gaussianizing can be estimated by maximum likelihood (ML). Contrary to the skewed case, the transformation is bijective: each observed data point is uniquely linked to its hidden (and normally tailed) input. A modular toolkit to analyze data using the proposed methods will soon be added to the LambertW package (cran.r-project.org/web/packages/LambertW), originally implemented for the skew Lambert W case.
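
The explicit inverse mentioned in the abstract can be illustrated with a minimal sketch. It assumes the heavy-tail transformation has the Tukey h form $z = u \exp(\frac{\delta}{2} u^2)$ with $z = (y - \mu)/\sigma$, a location-scale parametrization that the abstract itself does not spell out. Squaring and multiplying by $\delta$ gives $\delta z^2 = \delta u^2 \exp(\delta u^2)$, so $u = \operatorname{sgn}(z) \sqrt{W(\delta z^2)/\delta}$, where $W$ is the principal branch of the Lambert W function. All function names below are hypothetical and are not the API of the LambertW package; the Lambert W function is computed by a plain Newton iteration to keep the example dependency-free.

```python
import math

# Illustrative helper names only; this is NOT the API of the LambertW R package.

def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch W(x) for x >= 0: solves w * exp(w) = x by Newton's method."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting value; log(1 + x) >= W(x) for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def heavy_tail(u, delta):
    """Forward transformation (assumed form): z = u * exp(delta / 2 * u^2)."""
    return u * math.exp(0.5 * delta * u * u)

def gaussianize(z, delta):
    """Inverse transformation: u = sgn(z) * sqrt(W(delta * z^2) / delta)."""
    if delta < 0.0:
        raise ValueError("delta must be non-negative")
    if delta == 0.0:
        return z  # delta = 0 leaves the input unchanged
    return math.copysign(math.sqrt(lambert_w0(delta * z * z) / delta), z)

def tukey_h_cdf(y, delta, mu=0.0, sigma=1.0):
    """cdf of Y for Gaussian input: Phi(u) evaluated at the back-transformed point."""
    u = gaussianize((y - mu) / sigma, delta)
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def tukey_h_pdf(y, delta, mu=0.0, sigma=1.0):
    """pdf of Y for Gaussian input: phi(u) * du/dz / sigma (change of variables)."""
    u = gaussianize((y - mu) / sigma, delta)
    # dz/du = exp(delta / 2 * u^2) * (1 + delta * u^2), so du/dz is its reciprocal.
    du_dz = 1.0 / (math.exp(0.5 * delta * u * u) * (1.0 + delta * u * u))
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return phi * du_dz / sigma

if __name__ == "__main__":
    delta = 0.2
    y = heavy_tail(1.3, delta)                 # heavy-tailed observation
    print(y, gaussianize(y, delta))            # back-transformed latent input
    print(tukey_h_cdf(y, delta), tukey_h_pdf(y, delta))
```

Because the forward map is strictly increasing in $u$ for $\delta \geq 0$, the back-transformation is unique, which is what makes the cdf and pdf above a direct change of variables. In practice $\mu$, $\sigma$ and $\delta$ would have to be estimated, for example by ML as the abstract states; the sketch treats them as known.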