Open Access
CFM-BD: a distributed rule induction algorithm for building compact fuzzy models in Big Data classification problems
(IEEE, 2020) Elkano Ilintxeta, Mikel; Sanz Delgado, José Antonio; Barrenechea Tartas, Edurne; Bustince Sola, Humberto; Galar Idoate, Mikel; Estatistika, Informatika eta Matematika; Institute of Smart Cities - ISC; Estadística, Informática y Matemáticas
Interpretability has always been a major concern for fuzzy rule-based classifiers. Human-readable models can explain the reasoning behind their predictions and decisions. However, when it comes to Big Data classification problems, fuzzy rule-based classifiers have not been able to maintain the good tradeoff between accuracy and interpretability that has characterized these techniques in non-Big-Data environments. The most accurate methods build models composed of a large number of rules and fuzzy sets that are too complex, while those approaches focusing on interpretability do not provide state-of-the-art discrimination capabilities. In this paper, we propose a new distributed learning algorithm named CFM-BD to construct accurate and compact fuzzy rule-based classification systems for Big Data. This method has been specifically designed from scratch for Big Data problems and does not adapt or extend any existing algorithm. The proposed learning process consists of three stages: preprocessing based on the probability integral transform theorem; rule induction inspired by the CHI-BD and Apriori algorithms; and rule selection by means of a global evolutionary optimization. We conducted a complete empirical study to test the performance of our approach in terms of accuracy, complexity, and runtime. The results were compared with four state-of-the-art fuzzy classifiers for Big Data (FBDT, FMDT, Chi-Spark-RS, and CHI-BD). According to this study, CFM-BD provides competitive discrimination capabilities using significantly simpler models composed of a few rules with fewer than three antecedents, employing five linguistic labels for all variables.
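The preprocessing stage named in the abstract above can be illustrated with a minimal sketch. It assumes the probability integral transform is approximated with an empirical CDF, and that the five linguistic labels are evenly spaced triangular fuzzy sets on [0, 1]. Both choices, and the function names `empirical_cdf_transform` and `triangular_memberships`, are illustrative assumptions and are not taken from the CFM-BD paper.

```python
import numpy as np

def empirical_cdf_transform(train_column, values):
    """Map values into [0, 1] through the empirical CDF of train_column.

    Assumption: the empirical CDF stands in for the true CDF required by
    the probability integral transform.
    """
    sorted_train = np.sort(np.asarray(train_column, dtype=float))
    ranks = np.searchsorted(sorted_train, np.asarray(values, dtype=float), side="right")
    return ranks / sorted_train.size

def triangular_memberships(u, n_labels=5):
    """Membership of each u in [0, 1] to n_labels evenly spaced triangular sets."""
    centers = np.linspace(0.0, 1.0, n_labels)
    width = centers[1] - centers[0]
    u = np.asarray(u, dtype=float)[..., None]
    return np.clip(1.0 - np.abs(u - centers) / width, 0.0, 1.0)

# Hypothetical skewed feature: after the transform it is roughly uniform,
# so a uniform fuzzy partition covers the data evenly.
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=2.0, size=1000)
uniform_like = empirical_cdf_transform(skewed, skewed)
memberships = triangular_memberships(uniform_like)  # shape (1000, 5)
```

Under these assumptions, every transformed value activates at most two adjacent labels, and its membership degrees sum to one.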
Open Access
An evolutionary underbagging approach to tackle the survival prediction of trauma patients: a case study at the Hospital of Navarre
(IEEE, 2019) Sanz Delgado, José Antonio; Galar Idoate, Mikel; Bustince Sola, Humberto; Belzunegui Otano, Tomás; Estatistika, Informatika eta Matematika; Institute of Smart Cities - ISC; Estadística, Informática y Matemáticas; Gobierno de Navarra / Nafarroako Gobernua, PI-019/11
Survival prediction systems are used by hospital emergency services to measure their quality objectively. To do so, the mortality rate estimated by a prediction model is compared with the hospital's real rate. Hence, the accuracy of the prediction system is a key factor, since a more accurate system yields more reliable estimations. Survival prediction systems score the severity of patients' injuries; this score is then used to estimate whether the patient will survive. Fortunately, more patients survive their injuries than die from them. However, this class imbalance makes the prediction models harder to learn. The aim of this paper is to develop a new prediction system for the Hospital of Navarre that improves on the prediction capabilities of the currently used models, which would give a more reliable measurement of the hospital's quality. To do so, we propose a new strategy to build an ensemble of classifiers using an evolutionary undersampling process within the bagging methodology. The experimental study is carried out over 462 patients who were treated at the Hospital of Navarre. Our new ensemble approach is an appropriate tool for this problem, as it outperforms the models currently used by the hospital staff as well as several state-of-the-art ensemble approaches designed for imbalanced domains.
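The general idea of combining undersampling with bagging can be sketched as follows. This is not the evolutionary procedure proposed in the paper: plain random undersampling is swapped in for the evolutionary selection, and the function name `random_underbags` and the toy label vector are hypothetical.

```python
import numpy as np

def random_underbags(labels, n_bags=10, seed=0):
    """Build class-balanced bags of sample indices.

    Each bag draws, without replacement, as many samples from every class
    as the smallest class contains, so the minority class is kept whole
    and larger classes are randomly undersampled.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    minority_size = counts.min()
    bags = []
    for _ in range(n_bags):
        parts = [
            rng.choice(np.flatnonzero(labels == c), size=minority_size, replace=False)
            for c in classes
        ]
        bags.append(np.concatenate(parts))
    return bags

# Toy imbalanced labels (hypothetical): 90 survivors (0) and 10 deaths (1).
toy_labels = np.array([0] * 90 + [1] * 10)
bags = random_underbags(toy_labels, n_bags=3)
```

One base classifier would then be trained on each bag and the predictions aggregated, for example by majority vote. An evolutionary variant would search for which majority-class samples to keep instead of drawing them at random.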
Open Access
A compact evolutionary interval-valued fuzzy rule-based classification system for the modeling and prediction of real-world financial applications with imbalanced data
(IEEE, 2014) Sanz Delgado, José Antonio; Bernardo, Darío; Herrera, Francisco; Bustince Sola, Humberto; Hagras, Hani; Automática y Computación; Automatika eta Konputazioa
The current financial crisis has stressed the need for more accurate prediction models in order to decrease the risk when investing money in economic opportunities. In addition, the transparency of the decision-making process in financial applications is becoming an important issue. Furthermore, there is a need to handle real-world imbalanced financial datasets without using sampling techniques, which might introduce noise into the data. In this paper, we present a compact evolutionary interval-valued fuzzy rule-based classification system, based on IVTURSFARC-HD (Interval-Valued fuzzy rule-based classification system with TUning and Rule Selection) [22], for the modeling and prediction of real-world financial applications. The proposed system obtains good prediction accuracies using a small set of short fuzzy rules, which implies a high degree of interpretability of the generated linguistic model. Furthermore, the proposed system deals with imbalanced financial datasets without any preprocessing or sampling method, thus avoiding the accidental introduction of noise into the data used in the learning process. The system is also provided with a mechanism to handle examples that are not covered by any fuzzy rule in the generated rule base. To test the quality of our proposal, we present an experimental study including eleven real-world financial datasets. We show that the proposed system outperforms the original C4.5 decision tree, the type-1 and interval-valued fuzzy counterparts that use the SMOTE sampling technique to preprocess data, and the original FURIA, which is a fuzzy approximative classifier. Furthermore, the proposed method enhances the results achieved by the cost-sensitive C4.5 and gives competitive results when compared with FURIA using SMOTE, while avoiding preprocessing techniques and providing interpretable models that allow obtaining more accurate results.

Sanz Delgado, José Antonio
