Optimizing IoT Data Streams with Tumbling Window Techniques
Carchiolo V.; Malgeri M.
2025-01-01
Abstract
This paper presents a system for processing real-time streaming data that addresses the critical challenge of out-of-order data. In real-time applications, data may arrive outside the expected time sequence for various reasons, which can lead to inaccurate analysis if not handled properly. The system identifies such sequence anomalies and re-inserts each late data point into the time window it belongs to, so that all calculations, including those previously performed, are re-run with the updated data to maintain accuracy. The system uses time windows of predefined duration, on which specific operations are applied to the incoming data, such as averaging parameters over 10-minute intervals. This approach yields reliable results even in the presence of delays, interruptions, or distortions in the chronological order of the data. By ensuring that each data point is correctly analyzed, the system improves both the accuracy and reliability of real-time processing. The results demonstrate that effective handling of out-of-order data significantly improves the overall efficiency and accuracy of data analysis, making the system a valuable solution for a wide range of real-time data processing applications. Finally, the application of the proposed solution to data quality control at Bax Energy is discussed.
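
The re-insertion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TumblingWindowAverager`, its method names, and the use of timestamps in seconds are all hypothetical, and "out-of-order" is assumed here to mean a reading whose window starts earlier than the newest window seen so far. The sketch also updates the window average incrementally from a stored sum and count, whereas the system in the paper may re-run its calculations differently.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute tumbling windows, as in the abstract's example


class TumblingWindowAverager:
    """Hypothetical helper: keeps a sum and count per window so that
    late readings can update the average of an earlier window."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)
        self.latest_window = None  # start of the newest window seen so far

    def window_start(self, timestamp):
        # Tumbling windows do not overlap: each timestamp maps to exactly one window.
        return int(timestamp // self.window_seconds) * self.window_seconds

    def add(self, timestamp, value):
        """Insert a reading; return (window_start, updated_average, was_out_of_order)."""
        start = self.window_start(timestamp)
        # Assumption: a reading is out of order if it belongs to an older window.
        late = self.latest_window is not None and start < self.latest_window
        if self.latest_window is None or start > self.latest_window:
            self.latest_window = start
        self.sums[start] += value
        self.counts[start] += 1
        return start, self.sums[start] / self.counts[start], late

    def average(self, start):
        """Current average of the window beginning at `start`."""
        return self.sums[start] / self.counts[start]


agg = TumblingWindowAverager()
agg.add(30, 10.0)    # falls in the window starting at 0 s
agg.add(650, 20.0)   # falls in the window starting at 600 s
start, avg, late = agg.add(120, 30.0)  # arrives late, belongs to window 0
print(start, avg, late)
```

In this sketch the late reading at 120 s is routed back to the window starting at 0 s and that window's average is refreshed, while the window starting at 600 s is left unchanged.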


