
A diversity-based search approach to support annotation of a large fish image dataset

Giordano, Daniela; Palazzo, S.; Spampinato, Concetto
2016-01-01

Abstract

Label propagation annotates an unlabeled dataset starting from a set of labeled items. Most current methods, however, rely only on image similarity between labeled and unlabeled images to find propagation candidates, which, especially in very large datasets, may retrieve mostly near-duplicate images. Such approaches are technically correct, since they maximize propagation precision, but the resulting annotated dataset may be less useful because the images sharing a label lack intra-class variability. In this paper, we propose a label propagation approach that favors propagating an object’s label to a set of images representing as many different views of that object as possible, while preserving the relevance of the retrieved items to the query. Our method combines a diversity-based clustering technique, built on a random forest framework, with a label propagation step that propagates annotations effectively and efficiently through a similarity-based approach operating on clusters. Tested on a very large dataset of fish images, the method achieved good performance in automated label propagation, diversifying the annotated items while preserving precision. © 2015 Springer-Verlag Berlin Heidelberg
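The contrast between plain similarity search and diversity-aware propagation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that cluster assignments are already available (the paper derives them with a random forest framework, which is not reproduced here), and the function names, the Euclidean distance, the `max_dist` relevance threshold, and the toy feature vectors are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(query, candidates, k):
    """Plain k-NN: indices of the k candidates closest to the query."""
    order = sorted(range(len(candidates)),
                   key=lambda i: euclidean(query, candidates[i]))
    return order[:k]


def diverse_propagation(query, label, candidates, cluster_ids, k, max_dist):
    """Propagate `label` to at most k candidates, at most one per cluster.

    Candidates farther than `max_dist` from the query are discarded
    (assumed relevance criterion). Within each remaining cluster only the
    candidate closest to the query is kept, so near-duplicates that fall
    into the same cluster cannot fill the whole result list.
    """
    best_per_cluster = {}
    for idx, (features, cid) in enumerate(zip(candidates, cluster_ids)):
        dist = euclidean(query, features)
        if dist > max_dist:
            continue  # not relevant enough to receive the label
        if cid not in best_per_cluster or dist < best_per_cluster[cid][0]:
            best_per_cluster[cid] = (dist, idx)
    ranked = sorted(best_per_cluster.values())  # closest clusters first
    return {idx: label for _, idx in ranked[:k]}


# Toy data: three near-duplicates of the query (cluster 0), two other
# views of the same object (clusters 1 and 2), and one unrelated image.
query = (0.0, 0.0)
candidates = [(0.10, 0.0), (0.11, 0.0), (0.12, 0.0),
              (0.5, 0.5), (0.9, 0.1), (5.0, 5.0)]
cluster_ids = [0, 0, 0, 1, 2, 3]

plain = nearest_neighbours(query, candidates, k=3)
diverse = diverse_propagation(query, "clownfish", candidates,
                              cluster_ids, k=3, max_dist=1.0)
print(plain)            # indices chosen by similarity alone
print(sorted(diverse))  # indices chosen with one pick per cluster
```

On this toy data, similarity alone selects the three near-duplicates, whereas the per-cluster selection keeps one near-duplicate and adds the two other views, while the unrelated image is excluded by the distance threshold.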
Information retrieval; k-NN search; Random forest
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/30962