Wearable cameras make it easy to acquire long and unstructured egocentric videos. In this context, temporal video segmentation methods can improve the indexing, retrieval and summarization of such content. While past research has investigated methods for the temporal segmentation of egocentric videos according to different criteria (e.g., motion, location or appearance), many of them do not explicitly enforce any form of temporal coherence. Moreover, evaluations have generally been performed using frame-based measures, which only account for the overall correctness of predicted frames and overlook the structure of the produced segmentation. In this paper, we investigate how a Hidden Markov Model based on an ad-hoc transition matrix can be exploited to obtain a more accurate segmentation from frame-based predictions in the context of location-based segmentation of egocentric videos. We introduce a segment-based evaluation measure which strongly penalizes over-segmented and under-segmented results. Experiments show that exploiting a Hidden Markov Model for temporal smoothing greatly improves temporal segmentation results and outperforms current video segmentation methods designed for both third-person and first-person videos.
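The core idea described in the abstract, smoothing per-frame location predictions with a Hidden Markov Model whose transition matrix favors staying in the same state, can be sketched as Viterbi decoding over the frame-level class probabilities. This is a minimal illustration, not the authors' implementation; the function name `viterbi_smooth` and the single `stay_prob` parameter controlling the ad-hoc transition matrix are assumptions for the sake of the example.

```python
import numpy as np

def viterbi_smooth(frame_probs, stay_prob=0.99):
    """Smooth per-frame class probabilities via Viterbi decoding.

    frame_probs: (T, K) array of per-frame class probabilities.
    stay_prob: self-transition probability of the ad-hoc transition
               matrix; higher values penalize frequent segment changes
               (hypothetical parameter, chosen for illustration).
    """
    T, K = frame_probs.shape
    # Ad-hoc transition matrix: strong preference for self-transitions,
    # remaining mass spread uniformly over the other states.
    trans = np.full((K, K), (1.0 - stay_prob) / (K - 1))
    np.fill_diagonal(trans, stay_prob)
    log_trans = np.log(trans)
    log_emit = np.log(np.clip(frame_probs, 1e-12, None))

    # Viterbi dynamic programming in log space.
    score = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = log_emit[0] - np.log(K)  # uniform prior over states
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_trans        # (K, K) scores
        back[t] = np.argmax(cand, axis=0)               # best predecessor
        score[t] = cand[back[t], np.arange(K)] + log_emit[t]

    # Backtrace the most likely state sequence.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path
```

With a high `stay_prob`, a single noisy frame that momentarily favors a different location is absorbed into the surrounding segment, which is exactly the kind of temporal coherence that purely frame-based predictions lack.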
|Title:||On the exploitation of Hidden Markov models to improve location-based temporal segmentation of egocentric videos|
|Publication date:||2017|
|Appears in collections:||4.1 Contribution in conference proceedings|