Some Insights into Machine Learning for Retail Sales Forecasting
Giuseppe Nunnari
2025-01-01
Abstract
Sales forecasting is a fundamental task in retail, essential for inventory management, demand planning, and operational efficiency. This paper presents a comparative study of forecasting models on the M5 dataset, covering both aggregated time series (by \texttt{cat\_id} and \texttt{store\_id}) and more disaggregated series (by \texttt{cat\_id} only). The evaluation includes classical statistical models (ETS, SARIMAX), machine learning algorithms (LightGBM, XGBoost, MLP, KerasMLP), and deep learning approaches (Transformer), each tested with and without exogenous variables such as calendar indicators, price data, and event features. Results show that machine learning models, particularly MLP and LightGBM, perform well on aggregated series, benefiting from additional input features. However, their advantage diminishes at the disaggregated level, where the ETS model achieves the lowest average errors despite relying solely on historical sales data. This suggests that much of the predictive signal in retail data can be captured through temporal dynamics alone. The study highlights a trade-off between model complexity and forecasting performance: while advanced models offer flexibility and gains in specific contexts, simpler statistical methods like ETS provide robust, interpretable, and efficient alternatives, which are especially valuable in scenarios with limited data or operational constraints.