Automatic food understanding from images is an interesting challenge with applications in different domains. In particular, food intake monitoring is becoming more and more important because of the key role that it plays in health and market economies. In this paper, we address the study of food image processing from the perspective of Computer Vision. As first contribution we present a survey of the studies in the context of food image processing from the early attempts to the current state-of-the-art methods. Since retrieval and classification engines able to work on food images are required to build automatic systems for diet monitoring (e.g., to be embedded in wearable cameras), we focus our attention on the aspect of the representation of the food images because it plays a fundamental role in the understanding engines. The food retrieval and classification is a challenging task since the food presents high variableness and an intrinsic deformability. To properly study the peculiarities of different image representations we propose the UNICT-FD1200 dataset. It was composed of 4754 food images of 1200 distinct dishes acquired during real meals. Each food plate is acquired multiple times and the overall dataset presents both geometric and photometric variabilities. The images of the dataset have been manually labeled considering 8 categories: Appetizer, Main Course, Second Course, Single Course, Side Dish, Dessert, Breakfast, Fruit. We have performed tests employing different representations of the state-of-the-art to assess the related performances on the UNICT-FD1200 dataset. Finally, we propose a new representation based on the perceptual concept of Anti-Textons which is able to encode spatial information between Textons outperforming other representations in the context of food retrieval and Classification.

Retrieval and Classification of Food Images

FARINELLA, GIOVANNI MARIA;D. Allegra;STANCO, FILIPPO;BATTIATO, SEBASTIANO
2016-01-01

Abstract

Automatic food understanding from images is an interesting challenge with applications in different domains. In particular, food intake monitoring is becoming more and more important because of the key role that it plays in health and market economies. In this paper, we address the study of food image processing from the perspective of Computer Vision. As first contribution we present a survey of the studies in the context of food image processing from the early attempts to the current state-of-the-art methods. Since retrieval and classification engines able to work on food images are required to build automatic systems for diet monitoring (e.g., to be embedded in wearable cameras), we focus our attention on the aspect of the representation of the food images because it plays a fundamental role in the understanding engines. The food retrieval and classification is a challenging task since the food presents high variableness and an intrinsic deformability. To properly study the peculiarities of different image representations we propose the UNICT-FD1200 dataset. It was composed of 4754 food images of 1200 distinct dishes acquired during real meals. Each food plate is acquired multiple times and the overall dataset presents both geometric and photometric variabilities. The images of the dataset have been manually labeled considering 8 categories: Appetizer, Main Course, Second Course, Single Course, Side Dish, Dessert, Breakfast, Fruit. We have performed tests employing different representations of the state-of-the-art to assess the related performances on the UNICT-FD1200 dataset. Finally, we propose a new representation based on the perceptual concept of Anti-Textons which is able to encode spatial information between Textons outperforming other representations in the context of food retrieval and Classification.
2016
Food retrieval; Food classification; Food representation
File in questo prodotto:
File Dimensione Formato  
Retrieval and classification of food images.pdf

solo gestori archivio

Tipologia: Versione Editoriale (PDF)
Dimensione 4.54 MB
Formato Adobe PDF
4.54 MB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11769/19190
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 89
  • ???jsp.display-item.citation.isi??? 74
social impact