Initialization of Hidden Markov and Semi-Markov Models: A Critical Evaluation of Several Strategies

Punzo A.
2021-01-01

Abstract

The expectation–maximization (EM) algorithm is a standard tool for computing maximum likelihood estimates of the parameters of hidden Markov and semi-Markov models. This paper studies in detail how the initial parameter values influence the results produced by the algorithm. We compare random starts with partitional and model-based strategies for choosing the initial values of the EM algorithm in the case of multivariate Gaussian emission distributions (EDs), and assess the performance of each strategy under several criteria. Several data-generation settings are considered, varying the number of latent states, the number of variables, and the level of fuzziness in the data, and we discuss how each factor influences the results. Simulation results show that different initialization strategies may lead to different log-likelihood values and, accordingly, to different estimated partitions. A clear indication of which strategies should be preferred is given. We further include two real-data examples that are widely analysed in the hidden semi-Markov model literature.
2021
Keywords: EM algorithm; hidden Markov models; hidden semi-Markov models; initialization; simulation
Files in this record:
Maruotti & Punzo (2021) - ISR.pdf
Access: archive administrators only
Description: Main article
Type: Publisher's version (PDF)
Size: 9.54 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/523708