
Binary hidden Markov models and varieties

Abstract

The technological applications of hidden Markov models have been extremely diverse and successful, including natural language processing, gesture recognition, gene sequencing, and Kalman filtering of physical measurements. HMMs are highly non-linear statistical models, and just as linear models are amenable to linear algebraic techniques, non-linear models are amenable to commutative algebra and algebraic geometry. This paper examines closely those HMMs in which all the random variables, called nodes, are binary. Its main contributions are (1) minimal defining equations for the 4-node model, comprising 21 quadrics and 29 cubics, which were computed using Gröbner bases in the cumulant coordinates of Sturmfels and Zwiernik, and (2) a birational parametrization for every binary HMM, with an explicit inverse for recovering the hidden parameters in terms of observables. The new model parameters in (2) are hence rationally identifiable in the sense of Sullivant, Garcia-Puente, and Spielvogel, and each model's Zariski closure is therefore a rational projective variety of dimension 5. Gröbner basis computations for the model and its graph are found to be considerably faster using these parameters. Together, (1) and (2) provide a nearly instantaneous computational test for whether an observed probability distribution is due to a binary hidden Markov process, in comparison with a less specialized algorithm of Schönhuth involving matrix row reduction. Defining equations such as (1) have been used successfully in model selection problems in phylogenetics, and one can hope for similar applications in the case of HMMs.
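
To make the dimension-5 statement concrete, the following is a minimal sketch of the standard parametrization of a binary HMM; the notation (pi, T, E) is illustrative and not taken from the paper itself.

% Sketch of the standard binary HMM parametrization (illustrative notation, not the paper's).
% Hidden chain H_1,...,H_n and observed nodes V_1,...,V_n, all binary.
\[
\pi = (\pi_0,\; 1-\pi_0), \qquad
T = \begin{pmatrix} t_0 & 1-t_0 \\ 1-t_1 & t_1 \end{pmatrix}, \qquad
E = \begin{pmatrix} e_0 & 1-e_0 \\ 1-e_1 & e_1 \end{pmatrix},
\]
\[
P(V_1=v_1,\dots,V_n=v_n)
  \;=\; \pi \,\mathrm{diag}(E_{\cdot\, v_1})\, T \,\mathrm{diag}(E_{\cdot\, v_2})\, T \cdots T \,\mathrm{diag}(E_{\cdot\, v_n})\,\mathbf{1},
\]
% where diag(E_{., v_i}) is the 2x2 diagonal matrix of emission probabilities for the
% observed symbol v_i, and 1 is the all-ones column vector.

The five free parameters \(\pi_0, t_0, t_1, e_0, e_1\) are one way to see why each such model, and hence its Zariski closure, has dimension 5.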
