
In life-critical applications such as healthcare, it is essential that the services provided by a data management engine meet a prespecified set of SLAs that encapsulate QoS goals. Common metrics of how well a DSMS meets QoS requirements include, most importantly, latency/throughput, accuracy and resource utilization. Quality attributes constrain system functionality by specifying a qualification (a.k.a. annotation) on how those functions are performed, for example constraining a spatial query to be performed with low latency.

Distributed big data management systems should treat QoS-awareness as a first-class citizen when designing their services, so that they operate in accordance with the QoS properties of the SLAs. Achieving this goal is particularly challenging because it requires intelligently trading off several conflicting factors. The problem is aggravated in a fluctuating data stream setting, where data arrival rates oscillate between normal levels and peak bursts (sometimes fierce), largely because those rates are unknown a priori in real-time scenarios.

Some QoS metrics are based on time, for example throughput and latency.

• Throughput. It is loosely defined as the number of streaming tuples that can be processed with specific computation resources during a time period. The goal is normally high throughput. SPEs normally work by implicitly catching up with oscillations in the data arrival rate, aiming to maximize throughput.

• Latency. It is the total time required to process, end to end (i.e., passing through all the operators of the operator DAG), all tuples that arrive during a continuous query (CQ) session. It is measured from the moment data reaches the front stage of the DSMS from a stream ingestion system until results are served to the user, that is, until the user stops the CQ or the results are written to the sink of the data flow graph describing the stream processing operations. The goal for the latency QoS is always to lower it.
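The two time-based metrics above can be illustrated with a minimal sketch. It assumes each tuple is logged with an arrival time at the front stage and an emission time at the sink; the `TupleRecord` type and all function names are hypothetical and are not part of SpatialDSMS. The per-tuple mean is shown only as a commonly used complement to the session-level definition given in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TupleRecord:
    """Hypothetical log entry for one tuple's trip through the operator graph."""
    arrival_s: float  # time the tuple reached the front stage of the DSMS
    emitted_s: float  # time its result reached the sink


def throughput(records: List[TupleRecord], window_s: float) -> float:
    """Tuples processed per second over an observation window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return len(records) / window_s


def session_latency(records: List[TupleRecord]) -> float:
    """End-to-end time for the whole session: first arrival to last emission."""
    if not records:
        raise ValueError("no records")
    return max(r.emitted_s for r in records) - min(r.arrival_s for r in records)


def mean_tuple_latency(records: List[TupleRecord]) -> float:
    """Average per-tuple latency (an assumed, complementary view)."""
    if not records:
        raise ValueError("no records")
    return sum(r.emitted_s - r.arrival_s for r in records) / len(records)


records = [
    TupleRecord(arrival_s=0.0, emitted_s=0.5),
    TupleRecord(arrival_s=1.0, emitted_s=1.25),
    TupleRecord(arrival_s=2.0, emitted_s=3.0),
]
print(throughput(records, window_s=3.0))
print(session_latency(records))
print(mean_tuple_latency(records))
```

Keeping the session-level and per-tuple views separate makes it visible when a burst raises individual tuple latencies without changing the overall session duration much.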

Another QoS metric depends on the accuracy of the results obtained:

• Estimation quality. If the scenario calls for approximation, such as relying on samples instead of the whole population, the error bound tied to that approximation determines the estimation quality. Higher estimation quality is the goal in this case.

One more QoS metric we consider in this thesis is:

• Computation resource utilization. Computation resources are assets. The abundance of extra computing resources does not necessarily mean over-provisioning them (or under-provisioning them). Those resources are normally shared between various workloads, and a QoS-aware DSMS should seek to achieve a high resource utilization.
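To make the estimation quality metric above concrete, the following is a minimal sketch of how an error bound can accompany a sample-based estimate. It assumes simple random sampling without replacement and a normal-approximation bound (z = 1.96 for roughly 95% confidence, with no finite-population correction); the function name and parameters are hypothetical, and the thesis may use a different sampling design and bound.

```python
import math
import random
import statistics
from typing import Sequence, Tuple


def estimate_mean(
    population: Sequence[float],
    sample_size: int,
    z: float = 1.96,  # assumed normal-approximation multiplier (~95%)
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (estimate, error_bound) for the mean from a simple random sample."""
    if not 2 <= sample_size <= len(population):
        raise ValueError("sample_size must be between 2 and len(population)")
    rng = random.Random(seed)
    sample = rng.sample(list(population), sample_size)
    estimate = statistics.fmean(sample)
    # Standard error of the mean; a tighter bound means higher estimation quality.
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


population = [float(i) for i in range(1000)]
small_estimate, small_bound = estimate_mean(population, sample_size=50)
large_estimate, large_bound = estimate_mean(population, sample_size=800)
print(small_estimate, small_bound)
print(large_estimate, large_bound)
```

A larger sample tightens the bound but costs more time and resources to process, which is one instance of the conflict between accuracy and the time-based metrics.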

These four QoS metrics conflict with one another, and satisfying all of them together forces a tradeoff that can only be optimized to a certain degree. It is worth mentioning that some DSMSs operate on a “best effort” basis: they do not necessarily meet the QoS goals (especially time-based goals) but instead work at maximum capacity to get as close to the specified goal as possible. Other DSMSs are designed to guarantee a prespecified set of QoS goals, normally by applying cost models so as to guarantee the goals reactively (or proactively). However, current DSMSs are generally designed to operate in a “best effort” fashion and therefore cannot always guarantee the QoS goals specified by users, a problem that is aggravated under spatially heavy streaming workloads. We instead aim at a system that can meet a prespecified list of QoS goals and strike a plausible balance between the conflicting ones. In the next subsection, we explain the general methodology we apply to measure how well the services we provide in this thesis meet the QoS goals.

3.2.1 Methodology for Measuring the Achievement of Quality-of-Service Goals

We adopt the following scenario-based methodology to measure the ability of each component (i.e., the skill) to achieve a prespecified list of QoS goals. We call our method cause/effect-tactic-measure.

The cause is the event that gives rise to a QoS issue. The effect is the consequence of that issue. Tactics are the response mechanisms that we support in SpatialDSMS for mitigating the effects (i.e., reversing them). Measures are the metrics we use to assess how well each tactic achieves the QoS goal.

Categorizing tactics this way allows a more systematic architectural design. The choice of a tactic depends on how it affects the tradeoff between the participating QoS goals, and on whether the overall overhead of adopting it is mitigated enough to make adoption beneficial; in other words, the cost of incorporating it must not cancel out its benefits. This matters because the pattern applied is a trending layered pattern, where stacked layers normally add complexity and up-front running costs to the system. Causes include the nature of spatial characteristics (e.g., skewness, arrival rate fluctuation). Effects include low performance (i.e., in terms of time, throughput, resource utilization, estimation quality). Tactics include element-level optimization, adaptation and approximation. Measures include performance gain, speedup, estimation quality, etc. Figure 3.1 shows the workflow of the method.
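As a rough illustration of the cause/effect-tactic-measure idea, a scenario can be recorded as a small structure and a tactic can be judged by a measure such as speedup, net of its overhead. This is a sketch under stated assumptions: the `Scenario` type, the field values, and the helper names are hypothetical, and the rule that a tactic pays off when its processing time plus its up-front overhead stays below the baseline is one simple reading of "the cost must not cancel out its benefits", not the thesis's own cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record of one cause/effect-tactic-measure scenario."""
    cause: str    # event that gives rise to the QoS issue
    effect: str   # observed consequence of the issue
    tactic: str   # mechanism applied to mitigate the effect
    measure: str  # metric used to judge the tactic


def speedup(baseline_s: float, tactic_s: float) -> float:
    """Ratio of baseline running time to running time with the tactic."""
    if baseline_s <= 0 or tactic_s <= 0:
        raise ValueError("running times must be positive")
    return baseline_s / tactic_s


def worth_adopting(baseline_s: float, tactic_s: float, overhead_s: float) -> bool:
    """Assumed rule: adopt only if time with the tactic plus overhead beats baseline."""
    return tactic_s + overhead_s < baseline_s


scenario = Scenario(
    cause="arrival rate fluctuation",
    effect="low throughput",
    tactic="approximation",
    measure="speedup",
)
print(scenario)
print(speedup(baseline_s=10.0, tactic_s=4.0))
print(worth_adopting(baseline_s=10.0, tactic_s=4.0, overhead_s=1.0))
```

Recording overhead separately from the tactic's running time keeps the up-front cost of an extra layer visible when comparing tactics.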