HERO: A Multi-modal Approach on Mobile Devices for Visual-Aware Conversational Assistance in Industrial Domains

Bonanno C.; Ragusa F.; Furnari A.; Farinella G. M.
2023-01-01

Abstract

We present HERO, an artificial assistant designed to communicate with users through both natural language and images to aid them in carrying out procedures in industrial contexts. Our system is composed of five modules: 1) the input module retrieves user utterances and collects raw data, such as text and images; 2) the Natural Language Processing module processes text from user utterances; 3) the object detector module extracts entities by analyzing images captured by the user; 4) the Question Answering module generates responses to users' specific questions on procedures; and 5) the output module selects the final response to give to the user. We deployed and evaluated the system in an industrial laboratory furnished with different tools and equipment for carrying out repair and test operations on electrical boards. In this setting, the HERO system allows the user to retrieve information on tools, equipment, procedures, and safety rules. Experiments on domain-specific labeled data, as well as a user study, suggest that the design of our system is robust and that its use can be beneficial for users compared with classic methods for retrieving information and guiding workers, such as printed manuals.
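The five-module flow described in the abstract can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not the HERO implementation: every name here (`UserInput`, `Context`, `respond`, and the module functions) is hypothetical, the text processing and question answering are reduced to toy keyword lookups, and the object detector is a stub standing in for a real vision model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UserInput:
    """Raw data collected by the input module (hypothetical structure)."""
    text: str
    image: Optional[bytes] = None


@dataclass
class Context:
    """Intermediate results passed between modules (hypothetical structure)."""
    tokens: list = field(default_factory=list)
    entities: list = field(default_factory=list)
    candidates: list = field(default_factory=list)


def input_module(text, image=None):
    # 1) Collect the user utterance and, optionally, a captured image.
    return UserInput(text=text.strip(), image=image)


def nlp_module(user_input, ctx):
    # 2) Toy text processing: lowercase, drop trailing punctuation, split on spaces.
    ctx.tokens = user_input.text.lower().rstrip("?!.").split()


def object_detector_module(user_input, ctx):
    # 3) Stub detector: a real system would run a vision model on the image.
    if user_input.image is not None:
        ctx.entities.append("unknown_object")


def qa_module(ctx, knowledge):
    # 4) Toy question answering: look up tokens and entities in a dictionary.
    for key in ctx.tokens + ctx.entities:
        if key in knowledge:
            ctx.candidates.append(knowledge[key])


def output_module(ctx):
    # 5) Select the final response, with a fallback when nothing matched.
    if ctx.candidates:
        return ctx.candidates[0]
    return "Sorry, I could not find an answer."


def respond(text, image=None, knowledge=None):
    # Run the five steps in the order listed in the abstract.
    knowledge = knowledge or {}
    user_input = input_module(text, image)
    ctx = Context()
    nlp_module(user_input, ctx)
    object_detector_module(user_input, ctx)
    qa_module(ctx, knowledge)
    return output_module(ctx)
```

With a made-up knowledge base such as `{"oscilloscope": "The oscilloscope is on bench 2."}`, calling `respond("Where is the oscilloscope?", knowledge=kb)` returns the stored sentence, while an utterance with no matching keyword falls through to the fallback message. Keeping each module as a separate function mirrors the modular split named in the abstract and would let any stub be swapped for a real component.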
Year: 2023
ISBN: 978-3-031-43147-0; 978-3-031-43148-7
Keywords: Conversational Agents; First Person Vision

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/576432