Deep FoodRec: Food Recognition Technology Using Computer Vision and Deep Learning for Health Monitoring / Hussain, Mazhar. - (2024 Feb 12).
Deep FoodRec: Food Recognition Technology Using Computer Vision and Deep Learning for Health Monitoring
HUSSAIN, MAZHAR
2024-02-12
Abstract
The food recognition project (FoodRec) aims to define an automatic framework that uses computer vision and deep learning to recognize diverse foods from images. The goal of food recognition is to extract and infer semantic information from food images and to classify the different foods present in each image. The system acquires images of the food eaten by a user over time and processes them with food recognition algorithms; the extracted information is then used to track and monitor the dietary habits of people enrolled in a smoking cessation protocol. Food recognition is an active research area because of its wide range of potential real-world applications. For example, it would let people track their food intake simply by taking a picture and become more aware of their daily diet by monitoring the kind and amount of food eaten, how much time they spend eating during the day, how many meals they have and at what times, changes in their habits, bad habits, and other inferences about their behavior. It can also give a doctor a clearer view of a patient's behavior, response to the cessation treatment, and hence health needs. The project involves several stages, from image acquisition to recognition, with particular effort devoted to developing new segmentation and recognition algorithms that perform the food recognition task accurately. This PhD thesis presents semantic food segmentation and recognition using deep learning techniques. The proposed approaches were developed in the context of the FoodRec project, which aims to define an automatic framework for monitoring people's health and habits during their smoking cessation program.
The aim is to extract and infer semantic information from food images in order to analyze the diverse foods present in each image. We introduce a new dataset, FoodRec-50, with 50 food categories; its images were collected through iOS and Android smartphone applications and taken by 164 users during their smoking cessation therapy. After collection, the data undergo preprocessing, annotation, and augmentation with different transformations. For food recognition, we propose a deep convolutional neural network able to recognize the food items of specific users and monitor their habits. It consists of a food branch, which learns a visual representation of the input food items, and a user branch, which takes into account the specific user's eating habits. Experimental results show that the proposed method outperforms the baseline model on the FoodRec-50 dataset, and an ablation study demonstrates that the architecture can tune its predictions based on the users' eating habits. For food segmentation, we propose a novel Convolutional Deconvolutional Pyramid Network (CDPN) that captures the semantic information of an image at the pixel level. The network employs convolution and deconvolution layers to build a feature pyramid and obtain a high-level semantic feature map representation. As a consequence, it generates a dense and precise segmentation map of the input food image, and it demonstrated significant improvements on two well-known public benchmark datasets for food segmentation. We also propose a second network, the Food Convolutional Deconvolutional Network (FCDN), which performs semantic segmentation at the pixel level to recognize the different food items present in an image.
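To make the upsampling step concrete, the following is a minimal sketch in plain Python, assuming that "deconvolution" here means the transposed convolution commonly used in segmentation networks. It contrasts a learnable transposed convolution with parameter-free nearest-neighbour interpolation on a tiny feature map. The function names, the 2x2 kernels, and the stride of 2 are illustrative assumptions, not details taken from the thesis.

```python
from typing import List

Matrix = List[List[float]]


def transposed_conv2d(x: Matrix, kernel: Matrix, stride: int = 2) -> Matrix:
    """Upsample x by scattering a scaled copy of the kernel for every input value.

    In a real network the kernel weights are learned; here they are fixed by hand.
    """
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - 1) * stride + kh
    out_w = (w - 1) * stride + kw
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out


def nearest_upsample(x: Matrix, scale: int = 2) -> Matrix:
    """Parameter-free interpolation: repeat every value in a scale x scale block."""
    return [
        [x[i // scale][j // scale] for j in range(len(x[0]) * scale)]
        for i in range(len(x) * scale)
    ]


# Hypothetical 2x2 feature map and kernels, chosen only for illustration.
feature_map = [[1.0, 2.0], [3.0, 4.0]]
ones_kernel = [[1.0, 1.0], [1.0, 1.0]]
other_kernel = [[0.5, 0.25], [0.25, 0.0]]

up_interp = nearest_upsample(feature_map)
up_deconv_ones = transposed_conv2d(feature_map, ones_kernel)
up_deconv_other = transposed_conv2d(feature_map, other_kernel)
```

With an all-ones 2x2 kernel and stride 2, the transposed convolution reproduces nearest-neighbour upsampling exactly. Any other kernel yields a different 4x4 map, which is the sense in which this kind of upsampling is learnable.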
The proposed FCDN upsamples features using only learnable deconvolution layers, which increase the spatial resolution of the feature maps and learn complex patterns, whereas the proposed CDPN uses interpolation for feature upsampling in addition to deconvolution layers. Our proposed network demonstrated significant improvements on the benchmark food dataset compared to state-of-the-art methods. We also conducted a cross-dataset qualitative analysis of the proposed segmentation method to assess its generalization capabilities on our FoodRec dataset. The research outcomes of the food recognition work include two journal articles and three conference papers. This project was carried out under a research grant, within which I collaborated on developing food recognition algorithms. The research was sponsored by ECLAT srl, a spin-off of the University of Catania, with the help of a grant from the Foundation for a Smoke-Free World Inc., a US nonprofit 501(c)(3) private foundation with a mission to end smoking in this generation. My contributions to the project include data preprocessing, annotation, and augmentation after the data had been collected by the application, followed by the study, development, and evaluation of computer vision and deep learning algorithms to track and monitor people's dietary habits. Additional research work was carried out in collaboration with the Department of Drug and Health Science, University of Catania. Its aim was to find a correlation between well-defined, selected parameters, such as the type of nanocarrier, the particle size, and the surface charge, and the targeting efficiency indexes DTE and DTP. We performed cleaning, conversion, standardization, and classification of nose-to-brain drug delivery data using state-of-the-art machine learning algorithms.
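For readers unfamiliar with the two indexes, the sketch below computes them under the definitions commonly used in the nose-to-brain literature, which the abstract itself does not spell out. It assumes DTE is the drug targeting efficiency, the brain-to-blood exposure ratio after intranasal (IN) dosing relative to intravenous (IV) dosing, and DTP is the direct transport percentage, the share of brain exposure not explained by uptake from the blood. The function names and AUC values are hypothetical.

```python
def dte_percent(brain_in: float, blood_in: float,
                brain_iv: float, blood_iv: float) -> float:
    """DTE% = (AUC_brain / AUC_blood)_IN / (AUC_brain / AUC_blood)_IV * 100."""
    return (brain_in / blood_in) / (brain_iv / blood_iv) * 100.0


def dtp_percent(brain_in: float, blood_in: float,
                brain_iv: float, blood_iv: float) -> float:
    """DTP% = (B_IN - B_x) / B_IN * 100, with B_x = (B_IV / P_IV) * P_IN.

    B_x estimates the part of the intranasal brain AUC that arrived via the
    systemic circulation rather than by direct nose-to-brain transport.
    """
    b_x = (brain_iv / blood_iv) * blood_in
    return (brain_in - b_x) / brain_in * 100.0


# Made-up AUC values (arbitrary units), for illustration only.
auc = dict(brain_in=40.0, blood_in=20.0, brain_iv=10.0, blood_iv=20.0)
dte = dte_percent(**auc)
dtp = dtp_percent(**auc)
```

Under these definitions the two indexes are linked by DTP% = 100 - 10000 / DTE%, so a DTE above 100% corresponds to a positive DTP.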
We are working to publish a journal article from this work in collaboration with the Department of Drug and Health Science.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| PhD Thesis (Mazhar Hussain).pdf | Open access | Doctoral thesis | Public (public with copyright) | 24.64 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.