The ability to detect objects in cultural sites from the egocentric point of view of the user can enable interesting applications for both the visitors and the manager of the site. Unfortunately, current object detection algorithms have to be trained on large amounts of labeled data, the collection of which is costly and time-consuming. While synthetic data generated from the 3D model of the cultural site can be used to train object detection algorithms, a significant drop in performance is generally observed when such algorithms are deployed to work with real images. In this paper, we consider the problem of unsupervised domain adaptation for object detection in cultural sites. Specifically, we assume the availability of synthetic labeled images and real unlabeled images for training. To study the problem, we propose a dataset containing 75244 synthetic and 2190 real images with annotations for 16 different artworks. We hence investigate different domain adaptation techniques based on image-to-image translation and feature alignment. Our analysis points out that such techniques can be useful to address the domain adaptation issue, while there is still plenty of space for improvement on the proposed dataset. We release the dataset at our web page to encourage research on this challenging topic: https://iplab.dmi.unict.it/EGO-CH-OBJ-ADAPT/.
Unsupervised domain adaptation for object detection in cultural sites
Pasqualino G.;Furnari A.;Farinella G. M.
2020-01-01
Abstract
The ability to detect objects in cultural sites from the egocentric point of view of the user can enable interesting applications for both the visitors and the manager of the site. Unfortunately, current object detection algorithms have to be trained on large amounts of labeled data, the collection of which is costly and time-consuming. While synthetic data generated from the 3D model of the cultural site can be used to train object detection algorithms, a significant drop in performance is generally observed when such algorithms are deployed to work with real images. In this paper, we consider the problem of unsupervised domain adaptation for object detection in cultural sites. Specifically, we assume the availability of synthetic labeled images and real unlabeled images for training. To study the problem, we propose a dataset containing 75244 synthetic and 2190 real images with annotations for 16 different artworks. We hence investigate different domain adaptation techniques based on image-to-image translation and feature alignment. Our analysis points out that such techniques can be useful to address the domain adaptation issue, while there is still plenty of space for improvement on the proposed dataset. We release the dataset at our web page to encourage research on this challenging topic: https://iplab.dmi.unict.it/EGO-CH-OBJ-ADAPT/.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.