Mixtures of Matrix-Variate Contaminated Normal Distributions

Tomarchio S. D.; Punzo A.
2022-01-01

Abstract

Analysis of matrix-variate data is becoming ever more prevalent in the literature, especially in the area of clustering and classification. Real data, including real matrix-variate data, are often contaminated by potential outlying observations. Their detection, as well as the development of models insensitive to their presence, is particularly important for this type of data because of the practical issues concerning their effective visualization. Herein, the matrix-variate contaminated normal distribution is discussed and then utilized in the mixture model paradigm for clustering. One key advantage of the proposed model is the ability to automatically detect potential outlying matrices by computing their a posteriori probability of being typical or atypical. Such detection is currently unavailable using existing matrix-variate methods. An expectation conditional maximization algorithm is used for parameter estimation, and both simulated and real data are used for illustration. Supplementary files for this article are available online.
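The abstract does not state the density, so the following is only a minimal sketch of how the typical/atypical probability for a single component could be computed. It assumes the usual contaminated-normal construction: a matrix-normal density with mean M, row covariance Sigma, and column covariance Psi, mixed with a second matrix-normal whose row covariance is inflated by a factor eta > 1, with alpha the proportion of typical matrices. The function names and the parameters alpha and eta are illustrative assumptions, not taken from the article.

```python
import numpy as np


def matrix_normal_logpdf(X, M, Sigma, Psi):
    """Log-density of a p x r matrix normal with row cov Sigma, column cov Psi."""
    p, r = X.shape
    R = X - M
    # Mahalanobis-type distance: tr(Sigma^{-1} R Psi^{-1} R^T)
    delta = np.trace(np.linalg.solve(Sigma, R) @ np.linalg.solve(Psi, R.T))
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_psi = np.linalg.slogdet(Psi)
    return -0.5 * (p * r * np.log(2.0 * np.pi)
                   + r * logdet_sigma + p * logdet_psi + delta)


def typical_probability(X, M, Sigma, Psi, alpha, eta):
    """A posteriori probability that X is typical (assumed contaminated form)."""
    # Typical part: weight alpha, covariance Sigma
    log_good = np.log(alpha) + matrix_normal_logpdf(X, M, Sigma, Psi)
    # Atypical part: weight 1 - alpha, inflated covariance eta * Sigma
    log_bad = np.log1p(-alpha) + matrix_normal_logpdf(X, M, eta * Sigma, Psi)
    # Normalize on the log scale for numerical stability
    return float(np.exp(log_good - np.logaddexp(log_good, log_bad)))
```

Under these assumptions, a natural rule is to flag a matrix as potentially atypical when `typical_probability` falls below 0.5. In a full mixture model this quantity would be computed per component inside the estimation algorithm.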
2022
Contaminated distributions
Matrix-variate distributions
Mixture models
Files in this item:
File: 2022 - Tomarchio & Gallaugher & Punzo & McNicholas - Mixtures of Matrix-Variate Contaminated Normal Distributions.pdf
Access: archive managers only
Description: Journal article
Type: Publisher's version (PDF)
Size: 2.21 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/523703
Citations
  • Scopus 14