Attacking bitcoin anonymity: generative adversarial networks for improving bitcoin entity classification
Date
2022
Author
Version
Open access
Type
Article
Version
Published version
Impact
10.1007/s10489-022-03378-7
Abstract
Classification of Bitcoin entities is an important task to help Law Enforcement Agencies reduce anonymity in the Bitcoin
blockchain network and to detect classes more tied to illegal activities. However, this task is strongly conditioned by a
severe class imbalance in Bitcoin datasets. Existing approaches for addressing the class imbalance problem can be improved
considering generative adversarial networks (GANs) that can boost data diversity. However, GANs are mainly applied in
computer vision and natural language processing tasks, but not in Bitcoin entity behaviour classification where they may be
useful for learning and generating synthetic behaviours. Therefore, in this work, we present a novel approach to address the
class imbalance in Bitcoin entity classification by applying GANs. In particular, three GAN architectures were implemented
and compared in order to find the most suitable architecture for generating Bitcoin entity behaviours. More specifically,
GANs were used to address the Bitcoin imbalance problem by generating synthetic data of the less represented classes
before training the final entity classifier. The results were used to evaluate the capabilities of the different GAN architectures
in terms of training time, performance, repeatability, and computational costs. Finally, the results achieved by the proposed
GAN-based resampling were compared with those obtained using five well-known data-level preprocessing techniques.
Models trained with data resampled with our GAN-based approach achieved the highest accuracy improvements and were
among the best in terms of precision, recall and F1-score. Together with Random Oversampling (ROS), GANs proved to
be strong contenders in addressing Bitcoin class imbalance and consequently in reducing Bitcoin entity anonymity (overall
and per-class classification performance). To the best of our knowledge, this is the first work to explore the advantages
and limitations of GANs in generating specific Bitcoin data and “attacking” Bitcoin anonymity. The proposed methods
ultimately demonstrate that in Bitcoin applications, GANs are indeed able to learn the data distribution and generate new
samples starting from a very limited class representation, which leads to better detection of classes related to illegal activities.
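
The resampling step described in the abstract (adding samples to the less represented classes before training the final classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `synthesize` hook, and the toy data are assumptions made for illustration only. The hook shown here implements Random Oversampling (ROS), one of the baselines named above. A GAN-based variant would presumably replace it with a function that draws a sample from a generator trained on the class in question.

```python
import random
from collections import Counter, defaultdict


def resample_minority_classes(samples, labels, synthesize, seed=0):
    """Balance a dataset by adding samples to under-represented classes.

    `synthesize(class_samples, rng)` is a hypothetical hook that returns one
    new sample for a class, given the real samples of that class.
    """
    rng = random.Random(seed)

    # Group the real samples by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is topped up to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = list(samples), list(labels)
    for y, xs in by_class.items():
        for _ in range(target - len(xs)):
            out_x.append(synthesize(xs, rng))
            out_y.append(y)
    return out_x, out_y


def random_oversampling(class_samples, rng):
    """ROS hook: duplicate an existing sample chosen uniformly at random."""
    return rng.choice(class_samples)


if __name__ == "__main__":
    # Toy one-feature dataset with hypothetical labels (not from the paper).
    X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
    y = ["majority"] * 4 + ["minority"]
    balanced_X, balanced_y = resample_minority_classes(X, y, random_oversampling)
    print(Counter(balanced_y))
```

Keeping the sample-creation step behind a single hook lets the same pipeline compare different data-level techniques, since only the hook changes while the classifier training that follows stays the same.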
Subjects
Bitcoin address classification,
Class imbalance problem,
Entity anonymity attack,
Entity classification,
Generative adversarial networks (GAN)
Publisher
Springer
Published in
Applied Intelligence (2022)
Department
Universidad Pública de Navarra/Nafarroako Unibertsitate Publikoa. Institute of Smart Cities - ISC
Publisher's version
Funding bodies
This work has been partially supported by the Spanish Centre for the Development of Industrial Technology (CDTI) under the project ÉGIDA (CER20191012) - RED DE EXCELENCIA EN TECNOLOGIAS DE SEGURIDAD Y PRIVACIDAD.