Open Access
A survey of fingerprint classification Part II: experimental analysis and ensemble proposal
(Elsevier, 2015) Galar Idoate, Mikel; Derrac, Joaquín; Peralta, Daniel; Triguero, Isaac; Paternain Dallo, Daniel; López Molina, Carlos; García, Salvador; Benítez, José Manuel; Pagola Barrio, Miguel; Barrenechea Tartas, Edurne; Bustince Sola, Humberto; Herrera, Francisco; Automática y Computación; Automatika eta Konputazioa
In the first part of this paper we reviewed the fingerprint classification literature from two perspectives: feature extraction and classifier learning. Our attempt to determine which of the reviewed methods would perform best in a real implementation led to a discussion that showed how difficult this question is to answer. No previous comparison exists in the literature, and comparisons among papers are carried out under different experimental frameworks. Moreover, published methods are difficult to implement because their descriptions and parameters lack detail and no source code is shared. For this reason, in this paper we carry out an in-depth experimental study following the same double perspective. To do so, we have carefully implemented some of the most relevant feature extraction methods according to the explanations found in the corresponding papers, and we have tested their performance with different classifiers, including the specific proposals made by the authors. Our aim is to develop an objective experimental study in a common framework, which has not been done before and which can serve as a baseline for future work on the topic. In this way, we test not only the quality of the methods but also their reusability by other researchers, and we are able to indicate which proposals could be considered for future developments. Furthermore, we show that combining different feature extraction models in an ensemble can lead to superior performance, significantly improving on the results obtained by individual models.
Open Access
An empirical study on supervised and unsupervised fuzzy measure construction methods in highly imbalanced classification
(IEEE, 2020) Uriz Martín, Mikel Xabier; Paternain Dallo, Daniel; Bustince Sola, Humberto; Galar Idoate, Mikel; Estatistika, Informatika eta Matematika; Institute of Smart Cities - ISC; Estadística, Informática y Matemáticas; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa
The design of an ensemble of classifiers involves the definition of an aggregation mechanism that produces a single response from the information provided by the classifiers. A specific aggregation methodology that has been studied in the literature is the use of fuzzy integrals, such as the Choquet or the Sugeno integral, where the associated fuzzy measure tries to represent the interaction existing between the classifiers of the ensemble. However, defining the large number of coefficients of a fuzzy measure is not a trivial task, and therefore many different algorithms have been proposed. These can be split into supervised and unsupervised, each class having different learning mechanisms and particularities. Since there is no clear knowledge about which method should be used, in this work we propose an experimental study comparing the performance of eight different learning algorithms under the same framework of imbalanced datasets. Moreover, we also compare the two fuzzy integrals (Choquet and Sugeno) and their synergies with the different fuzzy measure construction methods.
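As an illustration of the aggregation step described in this abstract, the following is a minimal sketch of the discrete Choquet and Sugeno integrals. The function names, the dictionary representation of the fuzzy measure (keyed by `frozenset` of classifier indices), and the example scores are assumptions made for this sketch and are not taken from the paper.

```python
from itertools import combinations


def cardinality_measure(n):
    """Symmetric additive fuzzy measure mu(A) = |A| / n over indices 0..n-1 (illustrative choice)."""
    return {
        frozenset(c): size / n
        for size in range(n + 1)
        for c in combinations(range(n), size)
    }


def choquet(x, mu):
    """Discrete Choquet integral of the scores x with respect to the fuzzy measure mu."""
    order = sorted(range(len(x)), key=lambda i: x[i])  # indices by ascending score
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        # Coalition of all sources whose score is at least x[i]
        total += (x[i] - prev) * mu[frozenset(order[k:])]
        prev = x[i]
    return total


def sugeno(x, mu):
    """Discrete Sugeno integral: max over k of min(k-th smallest score, measure of the upper coalition)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    return max(min(x[i], mu[frozenset(order[k:])]) for k, i in enumerate(order))


# Hypothetical outputs of three classifiers for one class
scores = [0.2, 0.9, 0.6]
mu = cardinality_measure(3)
print(choquet(scores, mu))
print(sugeno(scores, mu))
```

With the additive cardinality measure the Choquet integral reduces to the arithmetic mean of the scores; a non-additive measure is what lets the integral reflect interactions between classifiers.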
Open Access
A supervised fuzzy measure learning algorithm for combining classifiers
(Elsevier, 2023) Uriz Martín, Mikel Xabier; Paternain Dallo, Daniel; Bustince Sola, Humberto; Galar Idoate, Mikel; Institute of Smart Cities - ISC; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa
Fuzzy measure-based aggregations allow taking interactions among coalitions of the input sources into account. The main difficulty in applying them to real-world problems, such as combining classifier ensembles, is defining the fuzzy measure that governs the aggregation and specifies the interactions. Nevertheless, their use for combining classifiers has proven advantageous. The learning of the fuzzy measure can be done either in a supervised or an unsupervised manner. This paper focuses on supervised approaches. Existing supervised approaches are designed to minimize the mean squared error cost function, even for classification problems. We propose a new fuzzy measure learning algorithm for combining classifiers that can optimize any cost function. To do so, we draw on advances from deep learning frameworks, such as automatic gradient computation. Therefore, a gradient-based method is presented together with three new update policies that are required to preserve the monotonicity constraints of the fuzzy measures. The usefulness of the proposal and of optimizing the cross-entropy cost is shown in an extensive experimental study with 58 datasets corresponding to both binary and multi-class classification problems. In this framework, the proposed method is compared with other state-of-the-art methods for fuzzy measure learning.
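To make the monotonicity constraint mentioned above concrete, here is a minimal sketch of one possible way to restore it after an unconstrained gradient step. It is not one of the three update policies proposed in the paper, which the abstract does not describe; the bottom-up repair rule, the function names, and the example values are all assumptions made for illustration.

```python
from itertools import combinations


def repair_monotonicity(mu, n):
    """Return a copy of mu with mu(empty) = 0, mu(full) = 1 and mu(A) >= mu(B) whenever B is a subset of A.

    Illustrative rule only: each coalition is raised to the largest value
    among its immediate subsets and clipped to [0, 1].
    """
    fixed = dict(mu)
    fixed[frozenset()] = 0.0
    for size in range(1, n):  # process smaller coalitions first
        for members in combinations(range(n), size):
            a = frozenset(members)
            floor = max(fixed[a - {i}] for i in a)
            fixed[a] = min(1.0, max(fixed[a], floor))
    fixed[frozenset(range(n))] = 1.0
    return fixed


def is_monotone(mu, n):
    """Check that no coalition is valued below any of its immediate subsets."""
    for size in range(1, n + 1):
        for members in combinations(range(n), size):
            a = frozenset(members)
            if any(mu[a - {i}] > mu[a] for i in a):
                return False
    return True


# Hypothetical coefficients after a gradient step that broke the constraints
raw = {
    frozenset(): 0.0,
    frozenset({0}): 0.5, frozenset({1}): -0.1, frozenset({2}): 0.2,
    frozenset({0, 1}): 0.3, frozenset({0, 2}): 0.6, frozenset({1, 2}): 1.2,
    frozenset({0, 1, 2}): 1.0,
}
repaired = repair_monotonicity(raw, 3)
print(is_monotone(raw, 3), is_monotone(repaired, 3))
```

Checking only immediate subsets is enough, because monotonicity along every single-element removal implies monotonicity for all nested coalitions.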
Open Access
A survey of fingerprint classification Part I: taxonomies on feature extraction methods and learning models
(Elsevier, 2015) Galar Idoate, Mikel; Derrac, Joaquín; Peralta, Daniel; Triguero, Isaac; Paternain Dallo, Daniel; López Molina, Carlos; García, Salvador; Benítez, José Manuel; Pagola Barrio, Miguel; Barrenechea Tartas, Edurne; Bustince Sola, Humberto; Herrera, Francisco; Automática y Computación; Automatika eta Konputazioa
This paper reviews the fingerprint classification literature, looking at the problem from a double perspective. We first deal with feature extraction methods, including the different models considered for singular point detection and for orientation map extraction. Then, we focus on the different learning models considered to build the classifiers used to label new fingerprints. Taxonomies and classifications for the feature extraction, singular point detection, orientation extraction and learning methods are presented. A critical view of the existing literature has led us to present a discussion of the existing methods and their drawbacks, such as the difficulty of reimplementing them, the lack of details, and major differences in their evaluation procedures. On this account, an experimental analysis of the most relevant methods is carried out in the second part of this paper, and a new method based on their combination is presented.
Open Access
An evolutionary underbagging approach to tackle the survival prediction of trauma patients: a case study at the Hospital of Navarre
(IEEE, 2019) Sanz Delgado, José Antonio; Galar Idoate, Mikel; Bustince Sola, Humberto; Belzunegui Otano, Tomás; Estatistika, Informatika eta Matematika; Institute of Smart Cities - ISC; Estadística, Informática y Matemáticas; Gobierno de Navarra / Nafarroako Gobernua, PI-019/11
Survival prediction systems are used by hospital emergency services to measure their quality objectively. To do so, the estimated mortality rate given by a prediction model is compared with the real rate of the hospital. Hence, the accuracy of the prediction system is a key factor, since more accurate predictions yield more reliable estimations. Survival prediction systems are aimed at scoring the severity of patients' injuries. Afterward, this score is used to estimate whether the patient will survive or not. Fortunately, the number of patients who survive their injuries is greater than the number who die. However, this degree of imbalance makes the prediction models more difficult to learn. The aim of this paper is to develop a new prediction system for the Hospital of Navarre that improves on the prediction capabilities of the currently used models, since this would provide a more reliable measurement of the hospital's quality. To do so, we propose a new strategy to build an ensemble of classifiers using an evolutionary undersampling process within the bagging methodology. The experimental study is carried out over 462 patients who were treated at the Hospital of Navarre. Our new ensemble approach is an appropriate tool for this problem, as it outperforms the models currently used by the hospital staff as well as several state-of-the-art ensemble approaches designed for imbalanced domains.
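The underbagging idea referred to above can be sketched as follows. This is a simplified illustration that uses plain random undersampling; the evolutionary undersampling process proposed in the paper is not reproduced here, and the function name, label encoding, and toy data are assumptions.

```python
import random


def balanced_bags(labels, n_bags, seed=0):
    """Build n_bags index sets, each with as many examples per class as the minority class has.

    Random undersampling stands in for the paper's evolutionary selection.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    minority_size = min(len(indices) for indices in by_class.values())

    bags = []
    for _ in range(n_bags):
        bag = []
        for indices in by_class.values():
            # Sample without replacement so every bag is class-balanced
            bag.extend(rng.sample(indices, minority_size))
        bags.append(sorted(bag))
    return bags


# Hypothetical toy labels: 1 = survived (majority), 0 = died (minority)
labels = [1] * 8 + [0] * 2
bags = balanced_bags(labels, n_bags=3)
for bag in bags:
    print(bag)
```

Each bag would then be used to train one base classifier, and the ensemble would combine their predictions; training and combination are omitted from this sketch.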
Open Access
Unsupervised fuzzy measure learning for classifier ensembles from coalitions performance
(IEEE, 2020) Uriz Martín, Mikel Xabier; Paternain Dallo, Daniel; Domínguez Catena, Iris; Bustince Sola, Humberto; Galar Idoate, Mikel; Institute of Smart Cities - ISC; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa, PJUPNA13
In Machine Learning, an ensemble refers to the combination of several classifiers with the objective of improving on the performance of each of its individual members. To design an ensemble, two main aspects must be considered: how to create a diverse set of classifiers and how to combine their outputs. This work focuses on the latter task. More specifically, we focus on the use of aggregation functions based on fuzzy measures, such as the Sugeno and Choquet integrals, since they allow modeling the coalitions and interactions among the members of the ensemble. In this scenario the challenge is how to construct a fuzzy measure that models the relations among the members of the ensemble. We focus on unsupervised methods for fuzzy measure construction, review existing alternatives and categorize them depending on their features. Furthermore, we address the weaknesses of previous alternatives by proposing a new construction method that obtains the fuzzy measure by directly evaluating the performance of each possible subset of classifiers, which can be computed efficiently. To test the usefulness of the proposed fuzzy measure, we focus on the application of ensembles to imbalanced datasets. We consider a set of 66 imbalanced datasets and develop a complete experimental study comparing the reviewed methods and our proposal.
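The notion of evaluating every possible subset of classifiers can be sketched as below. This only computes a raw accuracy per coalition for a binary problem; how the paper turns such scores into a valid fuzzy measure, and which performance metric it uses, is not stated in the abstract, so the averaging rule, the 0.5 threshold, the function name, and the toy data are all assumptions.

```python
from itertools import combinations


def coalition_accuracies(probs, y_true):
    """Accuracy of every non-empty coalition of classifiers.

    probs[j][i] is the probability of class 1 given by classifier j for example i.
    A coalition predicts class 1 when the mean probability of its members is >= 0.5.
    """
    n = len(probs)
    result = {}
    for size in range(1, n + 1):
        for members in combinations(range(n), size):
            correct = 0
            for i, y in enumerate(y_true):
                mean_p = sum(probs[j][i] for j in members) / size
                correct += int((mean_p >= 0.5) == bool(y))
            result[frozenset(members)] = correct / len(y_true)
    return result


# Hypothetical outputs of three classifiers on four validation examples
probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.6, 0.7, 0.3, 0.2],
    [0.8, 0.4, 0.7, 0.6],
]
y_true = [1, 0, 1, 0]
acc = coalition_accuracies(probs, y_true)
for coalition in sorted(acc, key=lambda c: (len(c), sorted(c))):
    print(sorted(coalition), acc[coalition])
```

In this toy data the pair of classifiers 0 and 1 scores lower than classifier 0 alone, so raw coalition scores are not guaranteed to be monotone and would need further processing before being used as a fuzzy measure.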
Open Access
INFFC: an iterative class noise filter based on the fusion of classifiers with noise sensitivity control
(Elsevier, 2015) Sáez, José Antonio; Galar Idoate, Mikel; Luengo, Julián; Herrera, Francisco; Automática y Computación; Automatika eta Konputazioa
In classification, noise may deteriorate the system performance and increase the complexity of the models built. In order to mitigate its consequences, several approaches have been proposed in the literature. Among them, noise filtering, which removes noisy examples from the training data, is one of the most used techniques. This paper proposes a new noise filtering method that combines several filtering strategies in order to increase the accuracy of the classification algorithms used after the filtering process. The filtering is based on the fusion of the predictions of several classifiers used to detect the presence of noise. We transfer the idea behind multiple classifier systems, where the information gathered from different models is combined, to noise filtering. In this way, we consider the combination of classifiers instead of using only one to detect noise. Additionally, the proposed method follows an iterative noise filtering scheme that allows us to avoid using the noisy examples already detected in each new iteration of the filtering process. Finally, we introduce a noisy score to control the filtering sensitivity, in such a way that the amount of noisy examples removed in each iteration can be adapted to the needs of the practitioner. The first two strategies (use of multiple classifiers and iterative filtering) are used to improve the filtering accuracy, whereas the last one (the noisy score) controls how conservative the filter is when removing potentially noisy examples. The validity of the proposed method is studied in an exhaustive experimental study. We compare the new filtering method against several state-of-the-art methods for dealing with datasets with class noise and study their efficacy with three classifiers with different sensitivity to noise.
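A single pass of a classifier-fusion noise filter can be sketched as follows. This is a simplification: the fraction of disagreeing classifiers is used here as a stand-in for the paper's noisy score, whose definition the abstract does not give, and the iterative retraining loop is omitted. The function name, the threshold parameter, and the toy predictions are assumptions.

```python
def vote_filter(predictions, labels, threshold=0.5):
    """Return the indices of examples kept after one filtering pass.

    predictions[j][i] is the label predicted by classifier j for example i.
    An example is removed when the fraction of classifiers that disagree with
    its given label exceeds the threshold (higher threshold = more conservative).
    """
    n_clf = len(predictions)
    keep = []
    for i, y in enumerate(labels):
        wrong = sum(1 for j in range(n_clf) if predictions[j][i] != y)
        if wrong / n_clf <= threshold:
            keep.append(i)
    return keep


# Hypothetical predictions of three classifiers on four training examples
predictions = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
labels = [0, 1, 1, 1]
print(vote_filter(predictions, labels))                 # majority-style filter
print(vote_filter(predictions, labels, threshold=0.9))  # more conservative filter
```

In an iterative scheme, the classifiers would be retrained on the kept examples and the pass repeated until a stopping criterion is met.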