Parallel and distributed Bayesian modelling for analysing high-dimensional spatio-temporal count data
This paper proposes a general procedure for analysing high-dimensional spatio-temporal count data, with special emphasis on relative risk estimation in cancer epidemiology. More precisely, we present a simple, pragmatic idea that makes it possible to fit hierarchical spatio-temporal models when the number of small areas is very large. Model fitting is carried out using integrated nested Laplace approximations (INLA) over a partition of the spatial domain. We also use parallel and distributed strategies to speed up computations in a setting where Bayesian model fitting is generally prohibitively time-consuming or even infeasible. The whole procedure is evaluated in a simulation study with a twofold objective: to estimate risks accurately and to detect extreme risk areas while avoiding false positives and false negatives. We show that our method outperforms classical global models. A real data analysis comparing the global models and the new procedure is also presented.
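The divide-and-conquer idea summarised above (split the spatial domain into a partition, fit a model on each piece, run the fits concurrently, then merge the area-level risk estimates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the per-partition fit here is a simple conjugate Poisson-gamma smoother standing in for a hierarchical model fitted with integrated nested Laplace approximations, and the names `fit_partition`, `fit_by_partition`, `partition_of` and `prior_strength` are hypothetical. Expected counts are assumed to be strictly positive.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_partition(records, prior_strength=1.0):
    """Stand-in for a per-partition model fit (NOT an INLA fit).

    records: list of (area_id, observed, expected) tuples for one partition.
    Assumes O_i | r_i ~ Poisson(E_i * r_i) and r_i ~ Gamma(a, b), whose
    posterior mean is (a + O_i) / (b + E_i). The prior mean a / b is set to
    the partition-level standardised ratio (total observed / total expected).
    """
    total_obs = sum(obs for _, obs, _ in records)
    total_exp = sum(exp for _, _, exp in records)
    b = prior_strength
    a = prior_strength * total_obs / total_exp
    return {area: (a + obs) / (b + exp) for area, obs, exp in records}


def fit_by_partition(data, partition_of, max_workers=4):
    """Fit each sub-domain independently and merge the risk estimates.

    data: iterable of (area_id, observed, expected).
    partition_of: dict mapping area_id to a partition label (hypothetical).
    """
    # Group the small areas by the sub-domain they belong to.
    groups = defaultdict(list)
    for area, obs, exp in data:
        groups[partition_of[area]].append((area, obs, exp))

    # Threads keep the sketch portable; a real workload would more likely
    # use separate processes or separate machines for each sub-domain.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fitted = list(pool.map(fit_partition, groups.values()))

    # Merge the per-partition results into one dict of relative risks.
    risks = {}
    for part in fitted:
        risks.update(part)
    return risks
```

In this simplified version each area belongs to exactly one sub-domain, so merging is a plain dictionary update. Because the sub-domain fits share no state, they can be dispatched to independent workers, which is what makes the parallel and distributed strategies applicable.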