Mixtures of regression models (MRMs) are widely used to investigate the relationship between variables coming from several unknown latent homogeneous groups. Usually, the conditional distribution of the response in each mixture component is assumed to be (multivariate) normal (MN-MRM). To make the approach robust to elliptical heavy-tailed departures from normality caused by mild outliers, the multivariate contaminated normal MRM is introduced here. In addition to the parameters of the MN-MRM, each mixture component has a parameter controlling the proportion of outliers and one specifying the degree of contamination with respect to the response variable(s). Crucially, these parameters do not have to be specified a priori, which adds flexibility to the approach. Furthermore, once the model is estimated and the observations are assigned to the groups, a finer intra-group classification into typical points and (mild) outliers can be obtained directly. Identifiability conditions are provided, an expectation-conditional maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments and compared with other procedures. The performance of this novel family of models is also illustrated on artificial and real data, with particular emphasis on applications in allometric studies.
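The role of the two extra parameters can be illustrated with a minimal sketch. It is not the estimation procedure of the paper: it assumes a single mixture component, a univariate response with one covariate (the model itself is multivariate), and the common parameterization in which `alpha` is the proportion of typical points (so `1 - alpha` is the proportion of outliers) and `eta > 1` inflates the variance of the outlier part. All function names and numeric values are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def contaminated_normal_pdf(y, mean, var, alpha, eta):
    """Contaminated normal density (assumed parameterization):
    weight alpha on N(mean, var), weight 1 - alpha on N(mean, eta * var)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if eta <= 1.0:
        raise ValueError("eta must be greater than 1")
    typical = alpha * normal_pdf(y, mean, var)
    outlying = (1.0 - alpha) * normal_pdf(y, mean, eta * var)
    return typical + outlying


def prob_typical(y, mean, var, alpha, eta):
    """Posterior probability that y is a typical point rather than an outlier."""
    typical = alpha * normal_pdf(y, mean, var)
    return typical / contaminated_normal_pdf(y, mean, var, alpha, eta)


def component_mean(x, intercept, slope):
    """Linear regression mean of the response given a single covariate x."""
    return intercept + slope * x


# Hypothetical parameter values for one component, chosen only for illustration.
mu = component_mean(x=1.0, intercept=0.5, slope=1.5)  # conditional mean at x = 1
for y in (mu, mu + 5.0):
    p = prob_typical(y, mu, var=1.0, alpha=0.9, eta=10.0)
    label = "typical" if p > 0.5 else "outlier"
    print(f"y = {y:.1f}: P(typical) = {p:.4f} -> {label}")
```

Under these assumptions, an observation close to the regression line is labelled typical and one far from it is labelled a mild outlier, which is the kind of intra-group classification described above. In the full model both `alpha` and `eta` would be estimated from the data for each component rather than fixed in advance.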
|Title:||Mixtures of multivariate contaminated normal regression models|
PUNZO, ANTONIO (Corresponding)
|Publication date:||2020|
|Appears in types:||1.1 Journal article|