The expectation–maximization (EM) algorithm is a familiar tool for computing maximum likelihood estimates of the parameters in hidden Markov and semi-Markov models. This paper studies in detail how the initial parameter values influence the results the algorithm produces. We compare random starts, partitional strategies, and model-based strategies for choosing the initial values for the EM algorithm in the case of multivariate Gaussian emission distributions (EDs), and we assess the performance of each strategy using several criteria. Several data generation settings are considered, varying the number of latent states, the number of variables, and the level of fuzziness in the data, and we discuss how each factor influences the results. Simulation results show that different initialization strategies may lead to different log-likelihood values and, accordingly, to different estimated partitions. A clear indication of which strategies should be preferred is given. We further include two real-data examples that have been widely analysed in the hidden semi-Markov model literature.
Initialization of Hidden Markov and Semi-Markov Models: A Critical Evaluation of Several Strategies
Punzo A.
2021-01-01
| File | Description | Type | Access | Size | Format |
|---|---|---|---|---|---|
| Maruotti & Punzo (2021) - ISR.pdf | Main article | Publisher's version (PDF) | Archive managers only | 9.54 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


