Organizing videos streams for clustering and estimation of popular scenes

Battiato, Sebastiano; Farinella, Giovanni M.; Milotta, Filippo; Ortis, Alessandro; Stanco, Filippo
2017-01-01

Abstract

The widespread diffusion of mobile devices with embedded cameras has opened new challenges in the automatic understanding of video streams acquired by multiple users during events such as sports matches, expos, and concerts. One goal is to determine which visual contents are the most relevant and popular (i.e., where users look). The popularity of a visual content is an important cue that can be exploited in several fields, including estimating the mood of the crowd attending an event and estimating the interest in parts of a cultural heritage site. In live social events, people capture and share videos related to the event. The popularity of a visual content can be obtained through the “visual consensus” among multiple video streams acquired by the different users’ devices. In this paper we address the problem of detecting and summarizing the “popular scenes” captured by users with mobile cameras during events. For this purpose, we have developed a framework called RECfusion, in which the key popular scenes of multiple streams are identified over time. Starting from a set of videos, the proposed system generates a video that captures the interests of the crowd by considering scene content popularity. The frames composing the final popular video are automatically selected from the different video streams by choosing the scene recorded by the highest number of users’ devices (i.e., the most popular scene).
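The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes that an upstream clustering step, not specified in the abstract and not shown here, has already assigned a scene label to each device at each time step. The function name `select_popular_frames`, the input layout, and the tie-breaking rule are hypothetical choices made for illustration only; this is not the RECfusion implementation.

```python
from collections import Counter


def select_popular_frames(labels_per_time):
    """Pick, for each time step, the most popular scene and one device showing it.

    labels_per_time: list of dicts mapping a device id (str) to the scene
    label (str) that device is recording at that time step. The labels are
    assumed to come from a separate clustering step (hypothetical input).
    Returns a list of (scene_label, device_id) tuples, one per time step.
    """
    selected = []
    for labels in labels_per_time:
        if not labels:
            # No device is recording at this time step.
            selected.append((None, None))
            continue
        counts = Counter(labels.values())
        # Most popular scene = label recorded by the highest number of devices.
        # Ties are broken by the smallest label (an arbitrary, assumed rule).
        popular = min(counts, key=lambda scene: (-counts[scene], scene))
        # Take the first device (in sorted order) that records the popular scene.
        device = min(d for d, scene in labels.items() if scene == popular)
        selected.append((popular, device))
    return selected


if __name__ == "__main__":
    streams = [
        {"cam1": "stage", "cam2": "stage", "cam3": "crowd"},
        {"cam1": "crowd", "cam2": "stage", "cam3": "crowd"},
    ]
    for step, (scene, device) in enumerate(select_popular_frames(streams)):
        print(step, scene, device)
```

In this sketch the output video would be assembled by taking, at each time step, the frame from the returned device. How a representative device is actually chosen within the popular cluster is not stated in the abstract, so the sorted-order choice above is only a placeholder.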
Year: 2017
ISBN: 9783319685595
Keywords: Clustering; Scene understanding; Social cameras; Video analysis; Theoretical Computer Science; Computer Science (all)
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/313706