This dissertation collects the research work done by the PhD candidate in the Joint Open Lab for Wireless Applications in multi-deVice Ecosystems (JOL WAVE) of TIM Telecom Italia, which sponsored his doctoral fellowship. The applications described here, in which a large amount of multimedia data is analyzed and summarized, may be the enabling technology for LTE-based multimedia services. Three main categories of media are treated in this dissertation: images, videos, and 3D data. For images and videos we built two frameworks: The Social Picture and RECfusion. The Social Picture is a framework for collecting and exploring huge amounts of crowd-sourced social images of public events, cultural sites, and other customized private events. RECfusion is designed for automatic video curation driven by the popularity of the scenes acquired by multiple devices. Through these two frameworks the following topics are discussed: Image Matching, Saliency Estimation, and Video and Scene Summarization. In particular, we investigate advanced image matching techniques (e.g., Compact Descriptors for Visual Search, CDVS). We detail Content Based Image Retrieval (CBIR) methods and describe how to compute the heatmap, which is a valuable tool for the analysis and summarization of large image collections. We then arrive at a novel definition of a saliency model, which we name Social Saliency. The name was chosen because the model of attention is obtained by querying the image databases of social media (e.g., Flickr, Instagram, Panoramio), which constitute a "social" environment. We describe how media collections can be employed for 3D reconstruction and social saliency estimation using images, scene and context tracking using videos, parametrization using 3D medical data, and preservation and restoration using 3D Cultural Heritage data. For all of these applications, we describe analysis and summarization methods to be used in high bandwidth connected environments.
We present four real use cases, co-authored and published in international journals and conferences: The Social Picture, RECfusion, 3D Data Analysis for Cultural Heritage, and 3D Data Analysis for Medical Research.
This doctoral thesis collects the research work carried out by the PhD candidate at the Joint Open Lab for Wireless Applications in multi-deVice Ecosystems (JOL WAVE CATANIA) of TIM Telecom Italia, which sponsored the doctorate. The applications described, in which a large amount of multimedia data is analyzed and summarized, may be the enabling technology for LTE-based multimedia services. Three main categories of media are covered in this thesis: images, videos, and 3D data. For images and videos we built two frameworks: The Social Picture and RECfusion. The Social Picture is a framework for collecting and exploring huge amounts of crowdsourced images from social networks, relating to public events, cultural sites, and other customized private events. RECfusion is designed to curate videos automatically, based on the popularity of the scenes captured by multiple devices (a multi-device setting). Through these two frameworks the following topics are discussed: Image Matching, Saliency Estimation, and Summarization of videos and images. In particular, advanced image matching techniques are addressed (for example, Compact Descriptors for Visual Search, CDVS). Content Based Image Retrieval (CBIR) methods are described in detail, and we describe how to compute the "heatmap", which is a valuable tool for analyzing and summarizing large image collections. We then provide a new definition of a saliency model, called Social Saliency. This name was chosen because the model of attention is obtained by querying the image databases of social media (e.g., Flickr, Instagram, Panoramio), which represent a "social" environment.
We describe how multimedia collections of images can be used for 3D reconstruction and social saliency estimation, how multimedia collections of videos can be used for scene and context tracking, and finally how multimedia collections of 3D data can be used for parametrization in the medical field and for conservation and restoration in the preservation of cultural heritage. For all these applications, we describe analysis and summarization methods geared to environments connected with LTE technology. Four real use cases are covered, for which the PhD candidate is co-author of several publications in international journals and conferences: The Social Picture, RECfusion, 3D Data Analysis for Cultural Heritage, and 3D Data Analysis for Medical Research.
Multi-Device Media Analysis and Summarization for High Bandwidth Connected Environment / Milotta, FILIPPO LUIGI MARIA. - (2017 Nov 28).
Multi-Device Media Analysis and Summarization for High Bandwidth Connected Environment
MILOTTA, FILIPPO LUIGI MARIA
2017-11-28
File | Size | Format |
---|---|---|
MILOTTA_TESIPHD_FINAL_281117.pdf (open access; type: doctoral thesis; license: public, with copyright) | 60.42 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.