Pattern Classification through Fuzzy Likelihood
PIDATELLA, Rosa Maria; GALLO, Giovanni
2013-01-01
Abstract
Generalization of classification rules is a fundamental issue in automatic pattern recognition. Overfitting a classifier to the training data is a well-known problem and has been the focus of much research in the last decade. Fuzzy techniques naturally provide soft representations of functions that can be adapted to address part of the overfitting/generalization dilemma. The literature contains many papers that use fuzzy theory as a means of classifying and extracting information from large amounts of data in a human-like fashion. Many authors have studied how to obtain the membership function of a fuzzy set by ad hoc heuristics, histograms, nearest-neighbour methods, and so on. Ding proposed a definition of a fuzzy likelihood measure in the context of similarity estimation, while Osoba laid the basis of adaptive fuzzy likelihood algorithms in the context of system theory and fuzzy logic.

In this paper we propose a new approach to supervised classification based on a novel fuzzy likelihood function. This function leads to a fuzzy version of Bayes' rule for maximum a posteriori (MAP) classification. The performance of the proposed method is close to that of classical methods, and the new technique provides several advantages: classification can be performed using a confidence threshold set by the user, and the approach intrinsically provides an automatic criterion that signals cases in which classification cannot be done safely.

Starting from the histograms of the observed data, we provide a simple way to obtain the membership function of a fuzzy set approximating the data distribution. This is obtained by combining the raw data histograms with their successively smoothed versions. A posterior probability is, in turn, obtained through a suitable fuzzy version of the Bayesian formula. It is important to note that, since our likelihoods are fuzzy numbers, the classical Bayes rule must be carefully translated in terms of "restricted fuzzy arithmetic" in order to obtain meaningful probabilities. To assign an observation to a class we adopt the "overtaking" relation between fuzzy numbers introduced in Anile. Overtaking mimics an ordering relation between fuzzy numbers that depends on an assigned threshold value. The ordering imposed by the overtaking relation translates immediately into the dominance of the posterior probability of one class over another for a given observed value. In this way a crisp classification is eventually obtained.

The proposed method has been tested on some standard data sets, and the results are reported below. The authors have implemented the proposed ideas in Matlab and performed classification on standard benchmarks; in all cases the results have been close to the theoretically optimal error rate.
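To make the pipeline described in the abstract concrete, the following is a minimal illustrative sketch in Python (not the authors' Matlab implementation): an interval-valued likelihood built from a raw histogram and its successively smoothed versions, pessimistic/optimistic bounds on the Bayes posterior as a crude stand-in for the paper's restricted fuzzy arithmetic, and a threshold-based dominance check in the spirit of the overtaking relation. All function names, the moving-average smoothing widths, and the threshold `tau` are assumptions introduced here for illustration only.

```python
import numpy as np

def interval_likelihood(samples, bins, smooth_widths=(3, 5, 7)):
    """Lower/upper envelope of the raw histogram and its successively
    smoothed versions; a simplified stand-in for the fuzzy membership
    construction described in the abstract (illustrative only)."""
    h, edges = np.histogram(samples, bins=bins, density=True)
    variants = [h]
    for w in smooth_widths:
        kernel = np.ones(w) / w                        # moving-average smoothing
        variants.append(np.convolve(h, kernel, mode="same"))
    stack = np.vstack(variants)
    return stack.min(axis=0), stack.max(axis=0), edges

def posterior_bounds(lik_lo, lik_hi, priors):
    """Pessimistic/optimistic bounds on the Bayes posterior for each class,
    given per-class likelihood bounds at the observed bin (a crude interval
    version of Bayes' rule, not the paper's restricted fuzzy arithmetic)."""
    num_lo = np.asarray(priors) * np.asarray(lik_lo)
    num_hi = np.asarray(priors) * np.asarray(lik_hi)
    eps = 1e-12                                        # guard against empty bins
    post_lo = num_lo / (num_lo + (num_hi.sum() - num_hi) + eps)  # own worst vs. rivals' best
    post_hi = num_hi / (num_hi + (num_lo.sum() - num_lo) + eps)  # own best vs. rivals' worst
    return post_lo, post_hi

def classify(post_lo, post_hi, tau=0.8):
    """Pick class c if even its pessimistic posterior exceeds a fraction tau
    of every rival's optimistic posterior (a toy 'overtaking'-style test);
    otherwise abstain, mirroring the rejection option in the abstract."""
    for c in range(len(post_lo)):
        rivals_hi = np.delete(post_hi, c)
        if np.all(post_lo[c] >= tau * rivals_hi):
            return c
    return None                                        # no safe classification
```

Under these assumptions, a test observation would be handled by locating its histogram bin, reading off the per-class likelihood interval at that bin, computing the posterior bounds, and either returning the dominating class or flagging the case as one in which classification cannot be done safely.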
| File | Size | Format |
|---|---|---|
| PGZ.pdf (open access; Publisher's version (PDF); license not specified) | 321.96 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.