Exploring the relationships between data complexity and classification diversity in ensembles

Formentín Garcia, Nathan; Tiggeman, Frederico; Borges, Eduardo N.; Lucca, Giancarlo; Santos, Helida; Pereira Dimuro, Graçaliz

Files

Formentin_ExploringRelationships_1663760809294_14587.pdf (412.48 KB)

Date

2021

Authors

Formentín Garcia, Nathan

Tiggeman, Frederico

Borges, Eduardo N.

Lucca, Giancarlo

Santos, Helida

Pereira Dimuro, Graçaliz

Publisher

SciTePress

Open access

Conference contribution

Published version

Impact

4

Not available in Scopus

Abstract

Several classification techniques have been proposed in recent years. Each approach is best suited to a particular kind of classification problem; that is, a given classification algorithm may fail to recognize some patterns in complex data effectively or efficiently. Selecting the best-tuned solution may be prohibitively expensive. Methods for combining classifiers have also been proposed, aiming to improve generalization ability and classification results. In this paper, we analyze geometrical features of the data class distribution and the diversity of the base classifiers to better understand the performance of an ensemble approach based on stacking. The experimental evaluation was conducted using 32 real datasets, twelve data complexity measures, five diversity measures, and five heterogeneous classification algorithms. The results show that stacked generalization outperforms the best individual base classifier when complex and imbalanced data are combined with diverse predictions among weak learners.
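
To make the notion of "diverse predictions" among base classifiers concrete, here is a minimal sketch of one possible diversity measure. The abstract does not name the five diversity measures used, so the choice of plain pairwise prediction disagreement is an assumption made for illustration, and the function names and sample predictions are hypothetical, not taken from the paper.

```python
from itertools import combinations


def disagreement(pred_a, pred_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    if not pred_a:
        raise ValueError("prediction lists must not be empty")
    differing = sum(1 for a, b in zip(pred_a, pred_b) if a != b)
    return differing / len(pred_a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all pairs of base classifiers."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical predictions of three base classifiers on six samples
# (illustrative values only, not results from the paper).
preds = [
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
print(mean_pairwise_disagreement(preds))
```

A value near 0 means the base classifiers almost always agree, so a stacked meta-learner has little complementary information to exploit; higher values indicate more diverse predictions.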

Keywords

Machine learning ensembles, Complexity measures, Diversity measures

Department

Statistics, Computer Science and Mathematics

URI

https://academica-e.unavarra.es/handle/2454/44108

https://doi.org/10.5220/0010440006520659

Citation

Garcia, N.; Tiggeman, F.; Borges, E.; Lucca, G.; Santos, H. and Dimuro, G. (2021). Exploring the Relationships between Data Complexity and Classification Diversity in Ensembles. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-509-8; ISSN 2184-4992, pages 652-659. DOI: 10.5220/0010440006520659

Collections

Conference communications and papers, DEIM
Conference communications and papers
