A Stata package for Cluster Weighted Modeling
Giorgio Vittadini (Methodology)
Salvatore Ingrassia (Methodology)
2023-01-01
Abstract
The Cluster-Weighted Model (CWM) belongs to the family of mixtures of regression models and is also referred to in the literature as a mixture of regressions with random covariates. Interest in CWMs is growing, and software for estimating such models is available to R users but not to Stata users. This article introduces the Stata package cwmglm, which fits CWMs based on the most common generalized linear models with random covariates. In addition, cwmglm can estimate parsimonious Gaussian models in which the variance-covariance matrix is parametrized through its eigenvalue decomposition. The package also introduces features not available in existing software: it extends current capabilities for estimating CWMs by letting users evaluate model fit through generalized coefficients of determination and by incorporating bootstrap-based inference. We illustrate the use of cwmglm with a real dataset on Covid-19 admissions in Northern Italy from February 2020 to June 2020. We explore the latent heterogeneity in these admissions by considering different candidate CWMs. In addition to modeling latent groups, we model the conditional density of the length of stay using the number of comorbidities, the number of procedures, the period of admission since the start of the pandemic, age, and sex as covariates. The covariates are also assigned a marginal distribution. Further examples use real data on university students and simulated data.