High availability in D-Bobox

Publication at Faculty of Mathematics and Physics | 2013

Abstract

Using a distributed environment for data stream processing brings many challenges, especially when an exact result is required from the processing of big data. A distributed system is more vulnerable to failures such as hardware crashes, software errors, or network malfunctions.

When a node fails, its current state and intermediate results are lost and the computation must be restarted, which increases both the time and the cost of the computation and is therefore unacceptable. Achieving high availability (HA) in such a system is itself challenging.

In this paper, we introduce D-Bobox, our framework for parallel and distributed processing, and the requirements it places on a high availability implementation. We also describe the main high availability methods in use today and discuss their applicability to our framework.

Finally, we propose a solution for achieving high availability in D-Bobox.