This section proposes a cost model that estimates the performance of both NTM and LTM topologies. The model supports the user decision process about the parameters to be used to generate the topology.
This study considers two main limiting factors of a topology: the CPU cost (com- putational cost) within a bolt and the network cost (communication cost) across bolts. Memory cost is also discussed in Section4.5. The following discussion focuses on the bolts, since the spout only acts as a simple stream broker. The main Key Performance Indicator (KPI) is the system throughput (e.g., number of tuples that can go through a topology per time unit). We also report the results of processing latency in Section 5.
One of the most important factors that determines the throughput is the number of task parallelization. Consider NTM, the single bolt can be instantiated to nN T M number
of task instances. Let T PN T M denote the throughput of the NTM topology and T PB1 the
throughput of one task instance. T PN T M should be proportional to nN T M:
4 Solution 63
The throughput of a task instance T PB1 can be modeled as:
T PB1 = rSinput× t rSinput× t × Cost(B1) = 1 Cost(B1) , (2)
where rSinputis the aggregated stream rate for all input streams Sinput. Note that we assume
the stream rate is same for each class in the input. Section4.5 discusses the case of input streams with different rates. During a time interval t, rSinput× t gives the total number of
tuples; rateinput× t × Cost(B1) gives the total amount of processing time that it needs.
Cost(B1) is the average processing latency (CPU cost) for each tuple. By dividing the
total number of tuples and the total processing time, we can drive the throughput T PB1.
Intuitively, the throughput of a single task is inversely proportional to the processing latency Cost(B1). Each class in B1 is involved in a different number of NIs. The cost for
processing one class Ci is linearly related to number of conjunction operations it involves,
|{oCi}|. Therefore, the average cost Cost(B1) is:
Cost(B1) = n
X
i=0
|{oCi}|/|C| (3)
The above equations allow users to estimate the throughput of a NTM topology, T PN T M.
Before extending the cost model for LTM, we first make the assumption that each bolt in LTM has the same number of deployed tasks (nLT M). Setting different numbers
of tasks for different bolts can potentially improve the performance; however, it creates a new problem of finding the best configuration. Authors in [32] propose a machine learning method to solve it. In this work, we do not consider this problem to highlight the trade-offs in consistency checking.
We extend the cost model from NTM to LTM in two ways. First, in LTM there are multiple bolts. The one with the minimal throughput is the critical one since, intuitively, the throughput of the pipeline can be as high as the bottleneck bolt. Therefore, we have:
T PLT M = nLT M × min(T PB1, . . . , T PBn) (4)
Second, the throughput of an individual bolt in LTM is affected by the presence of inference operations. Their cost should be much smaller than the cost of conjunction operations, since it is a simple mapping operation. We introduce α as the ratio between the cost of inference to conjunction (α < 1). On the other hand, if two bolts B1 and B2
in LTM have exactly the same workload, but B1 is in front of B2, B2 should have a lower
throughput than B1. The reason is that each tuple has to go through B1 and the network
before reaching B2. Therefore, to include the network cost, we add an extra penalty ρ
(ρ > 1), which increases exponentially as depth of a bolt in the pipeline (e.g., Bk should
have a penalty ρk−1, since a tuple has to be forwarded(processed) by the n − 1 bolts in the
front). By combining these two aspects, the cost of an bolt Bn in LTM can be modeled
as: Cost(Bk) = ρk−1 α ×Pmi=0|{o→ Ci}|+ Pn j=0|{o∩Cj}| |C| (5)
64
Shen Gao, Daniele Dell’Aglio, Jeff Z. Pan, and Abraham Bernstein
In practice, the values of α and ρ can be measured by running a benchmark topology. As shown in the experiments, different ways of assigning NI groups to bolts lead to different performance. Given a Tbox, users can first compile a LTM topology with the highest estimated throughput. For example, the cost model can tell that the throughput of the right topology is higher than the left in Figure4b. Then, users also compile a NTM topology, compare it with the LTM topology, and choose the one with higher throughput to deploy.