Indoor localization by projecting magnetic field signals onto images with vision transformer
Rafique, Hamaad; Patti, Davide; Palesi, Maurizio; La Delfa, Gaetano Carmelo
2025-01-01
Abstract
Indoor localization, powered by Internet of Things (IoT)-based sensor fusion, is an evolving application that uses mobile sensors and embedded hardware within buildings to pinpoint smartphone users’ locations. However, sensor heterogeneity across smartphones significantly compromises the reliability and accuracy of localization algorithms. To address this challenge, this paper introduces MH-ViL, an infrastructure-free and calibration-free framework built on multi-head self-attention and a vision transformer neural network. MH-ViL seamlessly integrates magnetic field signals (MFS) and visual images for localization tasks. A novel magnetic feature projection (MFP) model is proposed to map MFS onto visual image features, enhancing positional accuracy within the self-attention mechanism. Extensive real-time experiments demonstrate that MH-ViL surpasses alternative models, achieving 92% accuracy; predictions fall within a 95% confidence interval of 88% to 92%, with a localization error of less than 0.5 meters. Furthermore, the algorithm is evaluated in both homogeneous and heterogeneous environments to assess its generalizability.
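The abstract describes projecting magnetic field signals into the same feature space as the vision transformer's image patch embeddings so both modalities can interact through multi-head self-attention. The paper does not specify the architecture's details, so the sketch below is a minimal illustration under stated assumptions: the MFP step is taken to be a learned linear projection of a 3-axis magnetometer sample into the token dimension, and the projected MFS token is prepended to the patch tokens before one multi-head self-attention layer. All weight matrices (`W_mfp`, `Wq`, `Wk`, `Wv`, `Wo`) and shapes here are hypothetical, not the authors' trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, num_heads, rng):
    """Plain multi-head self-attention over a (n_tokens, d_model) array."""
    n, d = tokens.shape
    assert d % num_heads == 0
    dh = d // num_heads
    # Randomly initialized projections stand in for learned weights.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    heads = []
    for h in range(num_heads):
        sl = slice(h * dh, (h + 1) * dh)
        scores = Q[:, sl] @ K[:, sl].T / np.sqrt(dh)   # scaled dot-product
        heads.append(softmax(scores) @ V[:, sl])
    return np.concatenate(heads, axis=1) @ Wo

rng = np.random.default_rng(0)
d_model = 64

# 16 image patch embeddings (e.g., from a ViT patchifier), each d_model-dim.
patch_tokens = rng.standard_normal((16, d_model))

# One raw 3-axis magnetic field sample (x, y, z components, in microtesla).
mfs = np.array([22.5, -4.1, 38.0])

# Hypothetical MFP step: linear projection of the MFS sample into token space.
W_mfp = rng.standard_normal((3, d_model)) / np.sqrt(3)
mfs_token = mfs @ W_mfp

# Prepend the MFS token so attention can fuse it with every image patch.
tokens = np.vstack([mfs_token, patch_tokens])
fused = multi_head_self_attention(tokens, num_heads=8, rng=rng)
print(fused.shape)  # (17, 64): 1 MFS token + 16 patch tokens
```

In this toy fusion, every patch token can attend to the magnetic token (and vice versa), which is one plausible way a projection-based MFS/image combination could feed a transformer-based localization head; the actual MH-ViL pipeline may differ.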