scmamp: statistical comparison of multiple algorithms in multiple problems

Date
2016
Version
Open access
Type
Article
Version
Published version
Abstract
Comparing the results obtained by two or more algorithms in a set of problems is a central
task in areas such as machine learning or optimization. Drawing conclusions from these comparisons
may require the use of statistical tools such as hypothesis testing. There are some interesting papers
that cover this topic. In this manuscript we present scmamp, an R package aimed at being a tool
that simplifies the whole process of analyzing the results obtained when comparing algorithms, from
loading the data to the production of plots and tables.

Comparing the performance of different algorithms is an essential step in many research and
practical computational works. When new algorithms are proposed, they have to be compared with
the state of the art. Similarly, when an algorithm is used for a particular problem, its performance with
different sets of parameters has to be compared, in order to tune them for the best results.

When the differences are very clear (e.g., when an algorithm is the best in all the problems used in
the comparison), the direct comparison of the results may be enough. However, this is an unusual
situation and, thus, in most situations a direct comparison may be misleading and not enough to draw
sound conclusions; in those cases, the statistical assessment of the results is advisable.

The statistical comparison of algorithms in the context of machine learning has been covered in
several papers. In particular, the tools implemented in this package are those presented in Demšar
(2006); García and Herrera (2008); García et al. (2010). Another good review that covers, among other
aspects, the statistical assessment of the results in the context of supervised classification can be found
in Santafé et al. (2015).
Publisher
The R Foundation
Published in
The R Journal, Vol. 8/1, Aug. 2016
Department
Universidad Pública de Navarra. Department of Statistics and Operations Research