True scene understanding: classification, semantic segmentation and retrieval / Ravi', Daniele. - (2013 Dec 10).

True scene understanding: classification, semantic segmentation and retrieval

RAVI', DANIELE
2013-12-10

Abstract

The huge volume of images shared on websites and stored in personal archives poses challenges for large-scale multimedia management. Because of the well-known semantic gap between human-understandable high-level semantics and machine-generated low-level features, recent years have seen considerable research effort on multimedia content understanding and indexing. Computer vision algorithms for individual tasks such as object recognition, detection and segmentation have reached impressive results. The next challenge is to integrate these algorithms and address the problem of complete scene understanding, which involves explaining the image by recognizing all the objects of interest and their spatial extent or shape. True semantic understanding of an image mainly involves scene classification and semantic segmentation. The former aims to determine the categories to which an image belongs. The latter instead assigns each pixel a semantic label describing the category of the object it belongs to. Solutions for the semantic interpretation and understanding of images will enable and enhance a large variety of computer vision applications. Although a human can perform these tasks easily, doing so is laborious, and the sheer quantity of data involved can make them prohibitive for a computer. This thesis proposes novel approaches for semantic scene categorization, segmentation and retrieval that enable a device with limited resources to understand images automatically. The proposed computer vision solutions use machine-learning algorithms to build robust and reusable systems. Since learning is a key component of biological vision systems, the design of automatic artificial systems that are capable of learning is one of the most important trends in modern computer vision research.
10 Dec 2013
Scene; Understanding; Classification; Semantic; Segmentation; Retrieval; Mobile; devices; DCT
Files in this item:
File: main.pdf (open access)
Type: Doctoral thesis
License: PUBLIC - Public with Copyright
Size: 11.97 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/587073