Data stream clustering: introducing recursively extendable aggregation functions for incremental cluster fusion processes

dc.contributor.author: Urío Larrea, Asier
dc.contributor.author: Camargo, Heloisa A.
dc.contributor.author: Lucca, Giancarlo
dc.contributor.author: Asmus, Tiago da Cruz
dc.contributor.author: Marco Detchart, Cedric
dc.contributor.author: Schick, L.
dc.contributor.author: López Molina, Carlos
dc.contributor.author: Andreu-Pérez, Javier
dc.contributor.author: Bustince Sola, Humberto
dc.contributor.author: Dimuro, Graçaliz Pereira
dc.contributor.department: Estadística, Informática y Matemáticas [es_ES]
dc.contributor.department: Estatistika, Informatika eta Matematika [eu]
dc.contributor.department: Institute of Smart Cities - ISC [en]
dc.date.accessioned: 2025-03-21T09:54:40Z
dc.date.available: 2025-03-21T09:54:40Z
dc.date.issued: 2025-03-07
dc.date.updated: 2025-03-21T09:49:35Z
dc.description.abstract: In data stream (DS) learning, the system has to extract knowledge from data generated continuously, usually at high speed and in large volumes, making it impossible to store the entire set of data to be processed in batch mode. Hence, machine learning models must be built incrementally by processing the incoming examples, as data arrive, while updating the model to be compatible with the current data. In fuzzy DS clustering, the model can either absorb incoming data into existing clusters or initiate a new cluster. As the volume of data increases, there is a possibility that the clusters will overlap to the point where it is convenient to merge two or more clusters into one. Then, a cluster comparison measure (CM) should be applied, to decide whether such clusters should be combined, also in an incremental manner. This defines an incremental fusion process based on aggregation functions that can aggregate the incoming inputs without storing all the previous inputs. The objective of this article is to solve the fuzzy DS clustering problem of incrementally comparing fuzzy clusters on a formal basis. First, we formalize and operationalize incremental fusion processes of fuzzy clusters by introducing recursively extendable (RE) aggregation functions, studying construction methods and different classes of such functions. Second, we propose two approaches to compare clusters: 1) similarity and 2) overlapping between clusters, based on RE aggregation functions. Finally, we analyze the effect of those incremental CMs on the online and offline phases of the well-known fuzzy clustering algorithm d-FuzzStream, showing that our new approach outperforms the original algorithm and presents better or comparable performance to other state-of-the-art DS clustering algorithms found in the literature. [en]
dc.description.sponsorship: This work was supported in part by FAPERGS under Grant 24/2551-0001396-2, Grant 23/2551-0001865-9, and Grant 24/2551-0000723-7; in part by CNPq under Grant 304118/2023-0 and Grant 407206/2023-0; in part by FAPERGS/CNPq under Grant 23/2551-0000126-8; in part by CAPES, FAPESP under Grant 2022/09136-1; in part by MCIN/AEI/10.13039/50100011033/FEDER; and in part by UE under Grant PID2022-136627NB-I00 and Grant Santander-UPNA.
dc.format.mimetype: application/pdf [en]
dc.identifier.citation: Urio-Larrea, A., Camargo, H., Lucca, G., Asmus, T., Marco-Detchart, C., Schick, L., Lopez-Molina, C., Andreu-Perez, J., Bustince, H., Dimuro, G. P. (2025). Data stream clustering: introducing recursively extendable aggregation functions for incremental cluster fusion processes. IEEE Transactions on Cybernetics, 55(3), 1421-1435. https://doi.org/10.1109/TCYB.2025.3527862.
dc.identifier.doi: 10.1109/TCYB.2025.3527862
dc.identifier.issn: 2168-2267
dc.identifier.uri: https://academica-e.unavarra.es/handle/2454/53787
dc.language.iso: eng
dc.publisher: IEEE
dc.relation.ispartof: IEEE Transactions on Cybernetics (2025), vol. 55, no. 3
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-136627NB-I00/ES/
dc.relation.publisherversion: https://doi.org/10.1109/TCYB.2025.3527862
dc.rights: © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work.
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.subject: Data streams [en]
dc.subject: Fuzzy clustering [en]
dc.subject: Similarity measures [en]
dc.subject: Overlap indices [en]
dc.subject: Aggregation functions [en]
dc.title: Data stream clustering: introducing recursively extendable aggregation functions for incremental cluster fusion processes [en]
dc.type: info:eu-repo/semantics/article
dc.type.version: info:eu-repo/semantics/acceptedVersion
dspace.entity.type: Publication
relation.isAuthorOfPublication: eccff35a-2248-43bc-bd3f-f2daff964d42
relation.isAuthorOfPublication: 8c79084b-8af8-4913-a958-52ca175bd136
relation.isAuthorOfPublication: a5f4053a-a8c2-41e3-91c2-2b9dad6a72fd
relation.isAuthorOfPublication: b1df82f9-2ce4-488f-afe0-98e7f27ece58
relation.isAuthorOfPublication: 1bdd7a0e-704f-48e5-8d27-4486444f82c9
relation.isAuthorOfPublication.latestForDiscovery: eccff35a-2248-43bc-bd3f-f2daff964d42
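
Illustrative note on the abstract: the abstract describes aggregation functions that "can aggregate the incoming inputs without storing all the previous inputs". The record does not give the article's formal definition of recursively extendable (RE) aggregation functions, so the following Python is only a minimal sketch of that general idea, under the assumption that the next aggregated value can be computed from the previous value, the number of inputs seen so far, and the new input. The names IncrementalAggregator, mean_step, min_step and aggregate_stream are hypothetical and do not come from the article or from d-FuzzStream.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Assumed shape of an update rule: (previous value, previous count, new input) -> new value.
Step = Callable[[float, int, float], float]


@dataclass
class IncrementalAggregator:
    """Holds only the running value and the input count, never the inputs."""
    step: Step
    value: float = 0.0
    count: int = 0

    def update(self, x: float) -> float:
        # Assumption: inputs are degrees in the unit interval [0, 1].
        if not 0.0 <= x <= 1.0:
            raise ValueError("inputs are assumed to lie in [0, 1]")
        # The first input is taken as the initial aggregated value.
        self.value = x if self.count == 0 else self.step(self.value, self.count, x)
        self.count += 1
        return self.value


def mean_step(prev: float, n: int, x: float) -> float:
    """Arithmetic mean of n + 1 inputs from the mean of the first n."""
    return (n * prev + x) / (n + 1)


def min_step(prev: float, n: int, x: float) -> float:
    """Minimum of n + 1 inputs from the minimum of the first n."""
    return min(prev, x)


def aggregate_stream(step: Step, xs: Iterable[float]) -> float:
    """Feed a stream one element at a time and return the final value."""
    agg = IncrementalAggregator(step)
    for x in xs:
        agg.update(x)
    return agg.value
```

For example, aggregate_stream(mean_step, stream) gives the same result as averaging the whole stream in batch mode (up to floating-point rounding), while memory use stays constant regardless of stream length. Whether a given cluster comparison measure admits such an update rule is exactly the kind of question the article addresses; this sketch does not reproduce its construction methods.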

Files

Original bundle
Name: Urio_DataStream.pdf
Size: 2.36 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission