Generalizing max pooling via (a, b)-grouping functions for convolutional neural networks
Date
2023
Author
Version
Open access
Type
Article
Version
Published version
Project identifier
Impact
10.1016/j.inffus.2023.101893
Abstract
Due to their high adaptability to varied settings and effective optimization algorithm, Convolutional Neural
Networks (CNNs) have set the state-of-the-art on image processing jobs for the previous decade. CNNs work in
a sequential fashion, alternating between extracting significant features from an input image and aggregating
these features locally through "pooling" functions, in order to produce a more compact representation.
Functions like the arithmetic mean or, more typically, the maximum are commonly used to perform
this downsampling operation. Despite the fact that many studies have been devoted to the development of
alternative pooling algorithms, in practice, "max-pooling" still equals or exceeds most of these possibilities,
and has become the standard for CNN construction.
In this paper we focus on the properties that make the maximum such an efficient solution in the context
of CNN feature downsampling and propose its replacement by grouping functions, a family of functions that
share those desirable properties. In order to adapt these functions to the context of CNNs, we present
(𝑎, 𝑏)-grouping functions, an extension of grouping functions to work with real valued data. We present different
construction methods for (𝑎, 𝑏)-grouping functions, and demonstrate their empirical applicability for replacing
max-pooling by using them to replace the pooling function of many well-known CNN architectures, finding
promising results.
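The pooling step described in the abstract can be illustrated with a minimal sketch, in which the aggregation applied to each local window is a pluggable function. The helper names (`pool2d`, `mean`, `probabilistic_sum`) and the toy feature map are hypothetical and do not come from the paper. The probabilistic sum is used here only as a commonly cited example of a grouping function on [0, 1]; this record does not say which functions the authors use, nor how they are rescaled to a general interval [𝑎, 𝑏].

```python
from math import prod


def pool2d(image, reduce_fn, size=2):
    """Downsample a 2-D list by applying reduce_fn to each
    non-overlapping size x size window (stride equal to size)."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Flatten the window into a plain list of values.
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean, the usual 'average pooling' reduction."""
    return sum(values) / len(values)


def probabilistic_sum(values):
    """n-ary probabilistic sum on [0, 1]: 1 - prod(1 - x_i).
    Assumption: chosen as an illustrative grouping function only.
    It returns 0 only if every input is 0, and 1 if any input is 1."""
    return 1.0 - prod(1.0 - v for v in values)


# Toy 4x4 feature map with values already in [0, 1] (hypothetical data).
feature_map = [
    [0.0, 0.5, 1.0, 0.25],
    [0.5, 0.0, 0.0, 0.25],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.0],
]

max_pooled = pool2d(feature_map, max)
mean_pooled = pool2d(feature_map, mean)
psum_pooled = pool2d(feature_map, probabilistic_sum)
```

On this toy input the three reductions give `[[0.5, 1.0], [0.0, 0.5]]`, `[[0.25, 0.375], [0.0, 0.375]]` and `[[0.75, 1.0], [0.0, 0.875]]` respectively. Like the maximum, the probabilistic sum is zero only for an all-zero window and saturates when any activation reaches the upper bound, which is the kind of shared boundary behaviour the abstract alludes to.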
Subjects
Convolutional neural networks,
Grouping functions,
Pooling functions,
Image classification
Publisher
Elsevier
Published in
Information Fusion 99 (2023) 101893
Department
Universidad Pública de Navarra. Department of Statistics, Computer Science and Mathematics /
Universidad Pública de Navarra. Institute of Smart Cities - ISC
Publisher's version
Funding bodies
The authors gratefully acknowledge the financial support of Tracasa Instrumental (iTRACASA) and of the Gobierno de Navarra -
Departamento de Universidad, Innovación y Transformación Digital,
as well as that of the Spanish Ministry of Science (project PID2019-108392GB-I00 (AEI/10.13039/501100011033)) and the project
PC095-096 FUSIPROD. T. Asmus and G.P. Dimuro are supported by the
projects CNPq (301618/2019-4) and FAPERGS (19/2551-0001279-9).
F. Herrera is supported by the Andalusian Excellence project P18-FR4961. Z. Takáč is supported by grant VEGA 1/0267/21. Open access
funding provided by Universidad Pública de Navarra.