Landing transactional support on the cloud

Date
2013
Version
Open access
Type
Final Degree Project
Abstract
We took the Master thesis of I. Arrieta-Salinas and M. Louis Rodríguez as a starting point for
this project. We deploy a distributed database to be used in a cloud environment as a specific
case of Platform-as-a-Service.
We assume that data is partitioned and that several replicas store a copy of each partition.
Clients issue transactions by means of a standard library such as JDBC. To do so, they need
information about data placement, which is managed by a Metadata Manager. The Metadata
Manager handles partitioning and replica placement, building a replica cluster for each
partition. Within a replication cluster, a few replicas run a replication protocol that provides
strong consistency, while the rest receive updates through lazy propagation. These replicas are
logically arranged as onion layers around the core replicas that run the replication protocol.
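To make the layered design concrete, the following sketch models how a Metadata Manager might represent one partition's replica cluster: a small core of replicas running the replication protocol, surrounded by onion layers that only receive lazy update propagation. All class and field names here are hypothetical illustrations, not the project's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one partition's replica cluster:
// layer 0 holds the core replicas running the replication
// protocol (strong consistency); outer layers receive
// updates lazily, mirroring the "onion layers" design.
public class PartitionMetadata {
    private final int partitionId;
    // layers.get(0) = core replicas; layers.get(i > 0) = lazy layers
    private final List<List<String>> layers = new ArrayList<>();

    public PartitionMetadata(int partitionId) {
        this.partitionId = partitionId;
    }

    public void addReplica(int layer, String replicaAddress) {
        while (layers.size() <= layer) {
            layers.add(new ArrayList<>());
        }
        layers.get(layer).add(replicaAddress);
    }

    // Core replicas are the only ones running the replication protocol.
    public List<String> coreReplicas() {
        return layers.isEmpty() ? List.of() : layers.get(0);
    }

    // Replicas in outer layers receive updates by lazy propagation.
    public List<String> lazyReplicas() {
        List<String> result = new ArrayList<>();
        for (int i = 1; i < layers.size(); i++) {
            result.addAll(layers.get(i));
        }
        return result;
    }

    public int getPartitionId() { return partitionId; }
}
```

In this sketch the layer index doubles as the distance from the core, so the Metadata Manager can route updates to `coreReplicas()` and fall back to outer layers only for reads that tolerate staleness.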
The implementation of this system had several drawbacks that we try to fix in this work.
First of all, clients and the Metadata Manager needed to be physically on the same machine,
which leads to a performance penalty in heavily loaded scenarios. The system was optimized
for YCSB, whose workload consisted of single-operation transactions, and it ran over two
replication protocols, primary copy and active replication, which are known to perform badly
in update-intensive scenarios. Moreover, there was no load balancing based on replica
performance; requests merely followed a round-robin policy among all replicas at the core level.
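A round-robin policy ignores how loaded each replica actually is. The sketch below contrasts it with a load-aware chooser of the kind developed later in this work, which picks the replica reporting the lowest CPU load; the class and method names are illustrative assumptions, not the project's actual API.

```java
import java.util.List;
import java.util.Map;

// Illustrative replica selection policies (hypothetical API).
public class ReplicaChooser {
    private int next = 0;

    // Original policy: cycle through replicas regardless of load.
    public String roundRobin(List<String> replicas) {
        String chosen = replicas.get(next % replicas.size());
        next++;
        return chosen;
    }

    // Load-aware policy: pick the replica reporting the lowest
    // CPU load (values in [0.0, 1.0], assumed gathered elsewhere).
    public String leastLoaded(Map<String, Double> cpuLoadByReplica) {
        String best = null;
        double bestLoad = Double.MAX_VALUE;
        for (Map.Entry<String, Double> e : cpuLoadByReplica.entrySet()) {
            if (e.getValue() < bestLoad) {
                bestLoad = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Under a uniform workload both policies behave similarly, but when one replica is saturated, `leastLoaded` steers new transactions away from it while `roundRobin` keeps assigning it an equal share.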
We analyze the system limitations (described in more detail in Section 2.1) and then delve
into the system implementation, as explained in the rest of this work.
The main goals of this project concern the different parts of the system. Regarding the
Client Module, the original client was the OLTP-Benchmark, a module that sends specific
types of transactions to the system over a JDBC connection. In the current version this
module has been modified so that a transaction may contain more than one operation, and
several parameters have been added to transactions that allow the system to treat them
differently. Regarding the Metadata Manager, one of the main goals developed in this
project, among others, is the physical decentralization of the Client and Metadata Manager
modules. The remaining modifications are the creation of a structure that allows the
Metadata Manager to know the architecture of the Replicas Cluster, and the development of
a new ReplicaChooser function based on CPU load that enables proper load balancing.
Finally, new protocols have been implemented in the Replicas Cluster that make it possible
to run different replication protocols on different partitions simultaneously, without the
Client or the Metadata Manager being aware of it.
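As a rough illustration of the Client Module change, the sketch below models a client-side transaction that carries several operations plus a parameter (here, a read-only flag) that lets the system treat transactions differently. The names are hypothetical, not the benchmark's real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side transaction carrying several operations
// and extra parameters the system can inspect, e.g. to route
// read-only transactions to lazily updated replicas.
public class ClientTransaction {
    private final List<String> operations = new ArrayList<>();
    private final boolean readOnly;

    public ClientTransaction(boolean readOnly) {
        this.readOnly = readOnly;
    }

    // Fluent style: tx.add(...).add(...)
    public ClientTransaction add(String sqlOperation) {
        operations.add(sqlOperation);
        return this;
    }

    public List<String> getOperations() { return List.copyOf(operations); }
    public boolean isReadOnly() { return readOnly; }
    public int size() { return operations.size(); }
}
```

A client library could then ship the whole operation list to the chosen replica in one round trip, instead of sending one single-operation transaction at a time as in the original YCSB-oriented setup.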
Department
Universidad Pública de Navarra. Departamento de Ingeniería Matemática e Informática /
Nafarroako Unibertsitate Publikoa. Matematika eta Informatika Ingeniaritza Saila
Degree
Ingeniería en Informática /
Informatika Ingeniaritza