Unsupervised domain adaptation for 6DOF indoor localization
Furnari A.; Signorello G.; Farinella G. M.
2021-01-01
Abstract
Visual Localization is attracting increasing attention in computer vision due to the spread of wearable cameras (e.g., smart glasses) and the growing interest in autonomous vehicles and robots. Unfortunately, current localization algorithms rely on large amounts of labeled training data collected in the specific target environment in which the system needs to operate. Data collection and labeling in this context are difficult and time-consuming, and the process has to be repeated whenever the system is adapted to a new environment. In this work, we consider a scenario in which the target environment has been scanned to obtain a 3D model of the scene, suitable for generating large quantities of synthetic data automatically paired with localization labels. We hence investigate the use of Unsupervised Domain Adaptation techniques that exploit labeled synthetic data and unlabeled real data to train localization algorithms. To carry out the study, we introduce a new dataset composed of synthetic and real images labeled with their 6-DOF poses, collected in four different indoor rooms, which is available at https://iplab.dmi.unict.it/EGO-CH-LOC-UDA. A new method based on self-supervision and attention modules is then proposed and tested on this dataset. Results show that our method improves over baselines and state-of-the-art algorithms tackling similar domain adaptation tasks.
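The 6-DOF pose labels mentioned in the abstract pair each image with a camera position and orientation. As a minimal sketch of how localization accuracy is typically measured for such labels — assuming a translation vector plus a unit-quaternion orientation, a common but here hypothetical choice of representation, not necessarily the one used in the paper — one can compute a translation error and an angular rotation error between a predicted and a ground-truth pose:

```python
import math

def pose_errors(t_pred, q_pred, t_gt, q_gt):
    """Compare two 6-DOF poses.

    t_* : translation as (x, y, z)
    q_* : orientation as quaternion (w, x, y, z)

    Returns (translation error in the same units as t,
             rotation error in degrees).
    """
    # Euclidean distance between the two camera positions.
    t_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_pred, t_gt)))

    # Normalize both quaternions so the dot product is a valid cosine.
    def normalize(q):
        n = math.sqrt(sum(x * x for x in q))
        return [x / n for x in q]

    # |dot| handles the double-cover ambiguity (q and -q are the same rotation).
    d = abs(sum(a * b for a, b in zip(normalize(q_pred), normalize(q_gt))))
    r_err = 2.0 * math.degrees(math.acos(min(1.0, d)))
    return t_err, r_err
```

For example, a prediction displaced by one unit and rotated 90 degrees about the x-axis relative to the ground truth yields errors of 1.0 and 90.0 respectively.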