A Comparative Analysis of the Influence of Methods for Outliers Detection on the Performance of Data Driven Models
FORTUNA, Luigi; GRAZIANI, Salvatore
2007-01-01
Abstract
In this paper we describe, test, and compare the performance of a number of outlier detection techniques used to improve the modeling capabilities of soft sensors, starting from the quality of the available data. We analyze methods based on the standard deviation of the population, on the residuals of a linear input-output regression, on the correlation structure of the data, on principal components and partial least squares (both linear and nonlinear) in multidimensional space (2D, 3D, 4D), on Q and T² statistics, on the distance of each observation from the mean of the data, and on the Mahalanobis distance. We apply the outlier detection techniques both to data from a fictitious model and to real data acquired from a sulfur recovery unit of a refinery. We show that outlier removal almost always improves the modeling capabilities of the techniques considered.
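As an illustration of two of the method families named in the abstract, the following minimal sketch contrasts a per-variable standard-deviation rule with a Mahalanobis-distance rule on synthetic 2D data. It is not the authors' implementation: the function names, the threshold of 3, the use of population (rather than sample) statistics, and the synthetic data are all assumptions made for the example.

```python
import math
import random
from statistics import mean, pstdev


def sigma_outliers(values, k=3.0):
    """Indices of samples farther than k population standard deviations from the mean."""
    mu = mean(values)
    sd = pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sd]


def mahalanobis_outliers_2d(points, threshold=3.0):
    """Indices of 2D points whose Mahalanobis distance from the mean exceeds threshold."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Population covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy ** 2
    flagged = []
    for i, (x, y) in enumerate(points):
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for the 2x2 case.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        if math.sqrt(d2) > threshold:
            flagged.append(i)
    return flagged


random.seed(0)
# Synthetic, strongly correlated data (assumed for illustration): y is about 2x plus small noise.
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
points = [(x, 2.0 * x + random.gauss(0.0, 0.1)) for x in xs]
# This point breaks the correlation but is unremarkable in each variable taken alone.
points.append((1.0, -2.0))

flagged_by_x = sigma_outliers([p[0] for p in points])
flagged_by_y = sigma_outliers([p[1] for p in points])
flagged_by_distance = mahalanobis_outliers_2d(points)
```

In this sketch the appended point (index 200) is missed by the per-variable rule, because each coordinate lies well inside three standard deviations, while the Mahalanobis rule flags it because it accounts for the correlation between the two variables. Both rules may also flag a few ordinary samples from the tails of the distribution.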