This paper presents Visual Market Basket Analysis (VMBA), a novel application domain for egocentric vision systems. The final goal of VMBA is to infer the behavior of a store's customers during their shopping. The analysis relies on image sequences acquired by cameras mounted on shopping carts. The inferred behaviors can be coupled with classic Market Basket Analysis information (i.e., receipts) to help retailers improve the management of spaces and marketing strategies. To set up the challenge, we collected a new dataset of egocentric videos during real shopping sessions in a retail store. Video frames have been labeled according to a proposed hierarchy of 14 different customer behaviors, from the beginning (cart picking) to the end (cart releasing) of the shopping session. We benchmark different representation and classification techniques and propose a multimodal method which exploits visual, motion, and audio descriptors to perform classification with the Directed Acyclic Graph SVM learning architecture. Experiments highlight that employing multimodal representations and explicitly addressing the task in a hierarchical way are both beneficial. The devised approach based on Deep Features achieves an accuracy of more than 87% over the 14 classes of the considered dataset.
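As a rough illustration of the Directed Acyclic Graph SVM (DDAG) decision scheme named above, the sketch below trains one binary SVM per class pair and, at prediction time, walks the DAG so that each pairwise test eliminates one candidate class. This is a minimal sketch on synthetic data; the paper's actual visual, motion, and audio descriptors are not reproduced, and the `DAGSVM` class name and RBF-kernel choice are illustrative assumptions.

```python
# Minimal DAG-SVM (DDAG) sketch: one-vs-one SVMs combined via a
# decision DAG that discards one class per pairwise test.
from itertools import combinations
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

class DAGSVM:
    def fit(self, X, y):
        self.classes_ = np.unique(y)          # sorted class labels
        self.clf_ = {}
        # Train one binary SVM per unordered class pair (one-vs-one).
        for a, b in combinations(self.classes_, 2):
            mask = (y == a) | (y == b)
            self.clf_[(a, b)] = SVC(kernel="rbf").fit(X[mask], y[mask])
        return self

    def predict(self, X):
        preds = []
        for x in X:
            cand = list(self.classes_)        # stays sorted, so (cand[0], cand[-1]) is a valid key
            # Walk the DAG: each test between the two extreme candidates
            # removes the losing class; k-1 tests leave a single class.
            while len(cand) > 1:
                a, b = cand[0], cand[-1]
                winner = self.clf_[(a, b)].predict(x.reshape(1, -1))[0]
                cand.remove(a if winner == b else b)
            preds.append(cand[0])
        return np.array(preds)

# Synthetic 4-class example (stand-in for the 14 behavior classes).
X, y = make_blobs(n_samples=300, centers=4, random_state=0)
model = DAGSVM().fit(X, y)
acc = (model.predict(X) == y).mean()
```

With k classes the DAG needs only k-1 binary evaluations per sample, which is why DDAG is attractive for a 14-class problem compared with evaluating all k(k-1)/2 pairwise classifiers.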
Title: Market basket analysis from egocentric videos
Publication date: 2018
Appears in type: 1.1 Journal article