Predicting University Dropout Rates Using Machine Learning: UniCt Case
Miracula Vincenzo; Mazzeo Rinaldi Francesco; Giuffrida Giovanni
2025-01-01
Abstract
University dropout rates remain a significant concern for educational institutions, affecting both student outcomes and institutional performance. This study aims to develop a predictive model that identifies students at risk of dropping out, using features extracted from historical data. The dataset covers 42,000 students enrolled in undergraduate programs at the University of Catania over five academic years and across various university departments. We train and compare four machine learning models: Logistic Regression (LR), Random Forest (RF), eXtreme Gradient Boosting (XGB), and a Neural Network (NN). The XGB model achieved the highest accuracy at 97%, followed by RF at 94%.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.