A Stata package for Cluster Weighted Modeling

Giorgio Vittadini (Methodology); Salvatore Ingrassia (Methodology)
2023-01-01

Abstract

The Cluster-Weighted Model (CWM) belongs to the family of mixtures of regression models and is also referred to in the literature as a mixture of regressions with random covariates. Interest in CWMs is growing, and software for estimating such models is available to R users but not to Stata users. This article introduces the Stata package cwmglm, which fits CWMs based on the most common generalized linear models with random covariates. In addition, cwmglm can estimate parsimonious Gaussian models in which the parametrization of the variance-covariance matrix is based on the eigenvalue decomposition. The package also introduces features not available in other existing software: it extends current capabilities for estimating CWMs by letting users evaluate model fit through generalized coefficients of determination and by incorporating bootstrap-based inference. We illustrate the use of cwmglm with a real dataset on Covid-19 admissions in Northern Italy from February 2020 to June 2020. We explore the latent heterogeneity in these admissions by considering different candidate CWMs. In addition to modeling latent groups, we model the conditional density of the length of stay using the number of comorbidities, the number of procedures, the period of admission since the start of the pandemic, age, and sex as covariates. The covariates are also assigned a marginal distribution. Further examples use real data on university students and simulated data.
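
For orientation, the structure the abstract describes can be written compactly. The following is a sketch in standard CWM notation, not a formula from the article itself: the symbols G, \pi_g, \xi_g, \psi_g, \lambda_g, D_g and A_g are assumptions introduced here for illustration. The first line shows the joint density as a mixture in which each latent group contributes both a conditional (GLM-type) density for the response and a marginal density for the covariates; the second line shows the eigenvalue decomposition of a Gaussian group covariance matrix on which parsimonious parametrizations are usually based.

```latex
% Assumed notation: G latent groups, mixing weights \pi_g,
% regression parameters \xi_g, covariate-distribution parameters \psi_g.
\begin{align}
  p(\mathbf{x}, y; \boldsymbol{\theta})
    &= \sum_{g=1}^{G} \pi_g \,
       p(y \mid \mathbf{x}; \boldsymbol{\xi}_g)\,
       p(\mathbf{x}; \boldsymbol{\psi}_g),
    \qquad \pi_g > 0,\quad \sum_{g=1}^{G} \pi_g = 1, \\
  % \lambda_g: volume, D_g: orthogonal matrix of eigenvectors (orientation),
  % A_g: diagonal matrix with unit determinant (shape).
  \boldsymbol{\Sigma}_g
    &= \lambda_g \, \mathbf{D}_g \mathbf{A}_g \mathbf{D}_g^{\top}.
\end{align}
```

Under this reading, constraining \lambda_g, D_g or A_g to be equal across groups is what would yield the parsimonious Gaussian models; which constraints cwmglm actually supports is not stated in the abstract.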
2023
Clustering covid 19 admissions
Cluster weighted models
Stata
Latent clusters
Mixtures of regressions

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/618451