
Identifying Anomaly Detection Patterns from Log Files: A Dynamic Approach

Cavallaro C. (first author)
2021-01-01

Abstract

Context: Services that run in a data center can be configured to store information about their behaviour in dedicated logs. These logs contain a huge amount of data, which makes it difficult to understand the run-time properties of services manually. To facilitate their analysis, it is therefore important to use dynamic solutions that quickly parse log files, detect anomalies and diagnose problems in order to react promptly. Objectives: We want to build a model that detects anomalies within a given period of time. Furthermore, we wish to identify machine learning techniques that help us determine problem patterns from the messages available in the logs. Method: We have selected machine learning techniques, such as the Invariant Mining model, natural language processing and an autoencoder, that are able to work with messages and identify anomaly patterns. Given the data available, we decided to study monthly data and detect samples with a higher frequency of problems. We scanned the various logs for services that misbehaved in the same period of time, in order to recognize past anomalies in the data center and code these behaviours. Results: The results are promising: we obtained an average F-measure above 86%. Conclusion: Our model aims at quickly recognizing problems and solving them. It helps site administrators to better understand the running services, code anomalies and cross-check different messages in the same time slot.
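The idea of detecting time slots with a higher frequency of problems, and of scoring the result with an F-measure, can be illustrated with a minimal sketch. It is not the model described in the abstract: the Invariant Mining, natural language processing and autoencoder steps are replaced here by a plain frequency threshold, and the record schema `(timestamp, service, is_problem)`, the hourly slot size and the parameter `k` are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def slot_of(timestamp: str) -> str:
    """Map an ISO 8601 timestamp to its hourly time slot (assumed slot size)."""
    return datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")


def flag_slots(records, k=1.0):
    """Return the time slots whose problem count exceeds mean + k * std.

    `records` is an iterable of (timestamp, service, is_problem) tuples;
    this schema is hypothetical and not taken from the paper.
    """
    counts = Counter()
    for timestamp, _service, is_problem in records:
        # Adding 0 still registers the slot, so quiet slots count toward the mean.
        counts[slot_of(timestamp)] += int(is_problem)
    values = list(counts.values())
    threshold = mean(values) + k * pstdev(values)
    return {slot for slot, n in counts.items() if n > threshold}


def f_measure(predicted: set, actual: set) -> float:
    """Harmonic mean of precision and recall over sets of flagged slots."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)
```

Given a set of slots labelled as anomalous by an administrator, `f_measure(flag_slots(records), labelled_slots)` gives a score comparable in kind, though not in method, to the figure reported in the Results.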
2021
978-3-030-86959-5
978-3-030-86960-1
Anomaly detection
Log analysis
Log mining
Log parsing
Machine learning

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/540665