Publication:
Generative adversarial networks for bitcoin data augmentation

Date

2020

Authors

Zola, Francesco
Bruse, Jan Lukas
Etxeberria Barrio, Xabier

Publisher

IEEE
Open access
Conference contribution
Accepted version

Project identifier

European Commission/Horizon 2020 Framework Programme/740558

Abstract

In Bitcoin entity classification, results are strongly conditioned by the ground-truth dataset, especially when applying supervised machine learning approaches. However, these ground-truth datasets are frequently affected by significant class imbalance as generally they contain much more information regarding legal services (Exchange, Gambling), than regarding services that may be related to illicit activities (Mixer, Service). Class imbalance increases the complexity of applying machine learning techniques and reduces the quality of classification results, especially for underrepresented, but critical classes. In this paper, we propose to address this problem by using Generative Adversarial Networks (GANs) for Bitcoin data augmentation as GANs recently have shown promising results in the domain of image classification. However, there is no 'one-fits-all' GAN solution that works for every scenario. In fact, setting GAN training parameters is non-trivial and heavily affects the quality of the generated synthetic data. We therefore evaluate how GAN parameters such as the optimization function, the size of the dataset and the chosen batch size affect GAN implementation for one underrepresented entity class (Mining Pool) and demonstrate how a 'good' GAN configuration can be obtained that achieves high similarity between synthetically generated and real Bitcoin address data. To the best of our knowledge, this is the first study presenting GANs as a valid tool for generating synthetic address data for data augmentation in Bitcoin entity classification.
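
The parameter study mentioned in the abstract (optimization function, dataset size, batch size) can be pictured as a sweep over a configuration grid. The following is only a minimal sketch of how such a grid might be enumerated. The specific optimizer names, dataset sizes and batch sizes are hypothetical placeholders, not values reported by the authors, and `build_configurations` is an illustrative helper, not part of the paper.

```python
from itertools import product

# Hypothetical values, chosen only for illustration; the abstract does not list them.
OPTIMIZERS = ["adam", "rmsprop", "sgd"]
DATASET_SIZES = [500, 1000, 2000]
BATCH_SIZES = [32, 64, 128]


def build_configurations(optimizers, dataset_sizes, batch_sizes):
    """Return one dict per combination of the three GAN training parameters."""
    return [
        {"optimizer": opt, "dataset_size": n, "batch_size": b}
        for opt, n, b in product(optimizers, dataset_sizes, batch_sizes)
        if b <= n  # skip combinations where a batch would exceed the dataset
    ]


configs = build_configurations(OPTIMIZERS, DATASET_SIZES, BATCH_SIZES)
print(len(configs))
```

Each resulting dictionary would then drive one GAN training run, after which the synthetic samples could be compared against real address data with whatever similarity measure the study uses.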

Keywords

Address behaviour, Bitcoin classifier, Class imbalance, Data augmentation, Generative Adversarial Network

Department

Institute of Smart Cities - ISC

Citation

F. Zola, J. L. Bruse, X. E. Barrio, M. Galar and R. O. Urrutia, 'Generative Adversarial Networks for Bitcoin Data Augmentation,' 2020 2nd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS), 2020, pp. 136-143, doi: 10.1109/BRAINS49436.2020.9223269.

Rights

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work.

Documents in Academica-e are protected by copyright, with all rights reserved, unless otherwise indicated.