On the use of the contaminated Gaussian distribution in hidden Markov models for longitudinal data

PUNZO, ANTONIO;
2015-01-01

Abstract

We introduce a joint approach to time-varying clustering and bad-point detection in a longitudinal setting, extending the standard Gaussian hidden Markov model (N-HMM). We replace the multivariate Gaussian state-dependent distribution with a two-component Gaussian mixture: one (reference) component represents the data we would expect from the given state (i.e., good points), while the other component captures the atypical data and has a small prior probability, the same mean as the reference component, and an inflated covariance matrix. This change makes the resulting model, hereafter named CN-HMM, more robust to atypical observations than the N-HMM. We estimate the model parameters with an ad hoc version of the expectation-conditional maximization (ECM) algorithm, which extends the Baum-Welch iterative procedure to handle contaminated Gaussian distributions. We illustrate the proposal by analyzing a longitudinal dataset of Italian provinces on which four different crime rates were measured from 2005 to 2009.
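The state-dependent distribution described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The symbols `alpha` (prior probability of a good point) and `eta` (covariance inflation factor, greater than 1), and the function names, are assumptions introduced here for illustration and do not appear in the record. The sketch evaluates the two-component density for a single state and the posterior probability that an observation is a good point, which is the quantity one would threshold to flag bad points.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian density evaluated at a single point x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.shape[0]
    diff = x - mean
    # Squared Mahalanobis distance, computed without an explicit inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return float(np.exp(-0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)))


def contaminated_gaussian_pdf(x, mean, cov, alpha, eta):
    """Two-component mixture: same mean, second covariance inflated by eta.

    alpha and eta are hypothetical parameter names (assumed notation).
    """
    cov = np.asarray(cov, dtype=float)
    good = alpha * gaussian_pdf(x, mean, cov)
    bad = (1.0 - alpha) * gaussian_pdf(x, mean, eta * cov)
    return good + bad


def prob_good(x, mean, cov, alpha, eta):
    """Posterior probability that x comes from the reference (good) component."""
    cov = np.asarray(cov, dtype=float)
    good = alpha * gaussian_pdf(x, mean, cov)
    return good / contaminated_gaussian_pdf(x, mean, cov, alpha, eta)


# Illustrative values only; they are not estimates from the paper.
mean = np.zeros(2)
cov = np.eye(2)
p_typical = prob_good([0.0, 0.0], mean, cov, alpha=0.95, eta=10.0)
p_atypical = prob_good([5.0, 5.0], mean, cov, alpha=0.95, eta=10.0)
```

In this sketch a point at the state mean receives a posterior probability of being good close to 1, while a point far from the mean receives a probability close to 0. In a full CN-HMM these per-state densities would replace the Gaussian emission densities inside the forward-backward recursions.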
2015
978-88-8467-949-9
Hidden Markov Models; EM algorithm; Contaminated Gaussian Distribution
Files in this item:
Punzo & Maruotti - CLADAG 2015.pdf

Access: archive managers only
License: Not specified
Size: 758.85 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/97984