Survey of smartphone-based datasets for indoor localization: A machine learning perspective

Gaetano Carmelo La Delfa (first author; Investigation); Hamaad Rafique (Investigation); Maurizio Palesi (Supervision); Davide Patti (Supervision)
2025-01-01

Abstract

Indoor localization has gained significant attention in recent years due to its applications across sectors such as healthcare, logistics, manufacturing, and retail. However, while outdoor localization has been effectively addressed with GPS, indoor localization remains challenging despite significant research progress. Many studies have explored the capabilities of modern smartphones, equipped with a variety of sensors, to develop machine-learning methods for indoor localization, ranging from classical fingerprinting to deep sequence models and transformers. Nevertheless, most rely on small, proprietary datasets that are not publicly available. Large, high-quality public datasets are essential for researchers to efficiently test, refine, and validate algorithms, compare different approaches, and develop robust and accurate localization solutions. To reduce data collection time and costs and help researchers find the most appropriate datasets for their needs, this paper surveys 20 publicly available, high-quality indoor localization datasets suitable for machine learning, released between 2014 and 2024, that cover various sensing technologies. The survey reveals a shift toward multi-sensor data collection, extending beyond Wi-Fi and Bluetooth signals to include inertial sensors such as accelerometers and gyroscopes, as well as magnetic fields. It also highlights that while over 75% of datasets cover multi-floor structures or multiple buildings, there is a scarcity of datasets covering diverse types of indoor environments, with most focused on office or academic settings. Moreover, the temporal dimension, crucial in dynamic indoor scenarios, remains largely underrepresented, limiting the development of ML models for tracking dynamic trajectories or adapting to evolving signal patterns.
2025
Datasets
Deep learning
Indoor localization
Indoor navigation
Indoor positioning
Machine learning
Smartphone sensors

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/684629