Asymmetric clusters and outliers: Mixtures of multivariate contaminated shifted asymmetric Laplace distributions

Punzo A.
2019-01-01

Abstract

Mixtures of multivariate contaminated shifted asymmetric Laplace distributions are developed for handling asymmetric clusters in the presence of outliers (also referred to as bad points herein). In addition to the parameters of the related non-contaminated mixture, for each (asymmetric) cluster, our model has one parameter controlling the proportion of outliers and another specifying the degree of contamination. Crucially, these parameters do not have to be specified a priori, adding a flexibility to our approach that is absent from other approaches such as trimming. Moreover, each observation is given an a posteriori probability of belonging to a particular cluster, and of being an outlier or not; advantageously, this allows for the automatic detection of outliers. An expectation–conditional maximization algorithm is outlined for parameter estimation and various implementation issues are discussed. The behavior of the proposed model is investigated, and compared with well-established finite mixture approaches, on artificial and real data.
2019
Mixture models; Model-based clustering; Outlier detection; Shifted asymmetric Laplace distribution; Statistics and Probability; Computational Mathematics; Computational Theory and Mathematics; Applied Mathematics
Files in this item:
File: Morris, Punzo, McNicholas & Browne (2019) - CSDA.pdf
Access: archive managers only
Description: Main article
Type: Publisher's version (PDF)
Size: 1.27 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/361461