
This section presents the foundational CEPSim concepts on top of which the simulation algorithms are implemented. First, the CEPSim query model, which is used to define the simulated queries, is discussed. Then, the event set and event set queue abstractions are described.

6.3.1 Query Model

CEPSim uses AGeCEP as its formal foundation. Therefore, every user-defined query q is represented by an attributed directed acyclic graph $G = (V, E, ATT)$, where each vertex $v \in V$ represents a query element and each edge $(u, v) \in E$ represents an event stream flowing from an element u to another element v. In addition, the set of vertices V is partitioned into $V_p$, $V_c$, and $V_o$, representing event producers, event consumers, and operators, respectively. Figure 6.2 shows an example of a query q. Some attributes have been omitted for the sake of clarity.

CEPSim overcomes the limitations of the CloudSim batch application model by using the AGeCEP query model, which can represent complex data processing flows consisting of multiple interconnected steps. In addition, as discussed in Section 4.4.1, most existing CEP query languages can be converted to the AGeCEP model, which emphasizes the generic aspect of CEPSim.

Moreover, CEPSim extends the AGeCEP representation in order to make it more appropriate for simulations. First, every vertex is extended with a new attribute ipe, which represents the number of CPU instructions needed to process a single event. This is an important piece of information required by the simulation algorithms. For event producers, this attribute estimates the number of instructions required to take an event from the system input and forward it to query execution. In other words, it does not include the effort required to generate the event because event generation does not usually occur within the CEP system.

Figure 6.3: Windowed operator attributes. [Diagram: a time axis showing two windows, W1 and W2, annotated with the window and advance durations.]

Second, every edge $(u, v) \in E$ is extended with a selectivity attribute that determines how many of the events processed by u are actually sent to v. In Figure 6.2, the query edges are annotated with their selectivity values. For instance, the selectivities of edges e4 and e5 are both 0.5. Therefore, if s1 processes 100 events, 50 of them will be sent to f1 and the other 50 to f2. A selectivity can be greater than one when an operator outputs more than one event for a single input, e.g., creating two alarms from a single sensor reading. Note that in AGeCEP, selectivity is also a vertex attribute that refers to the total number of events that are output as a function of the number of input events. In other words, the selectivity of a vertex is the sum of the selectivities of all its outgoing edges.
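The ipe and selectivity attributes described above can be illustrated with a minimal Python sketch. It reuses the s1, f1, and f2 vertices and the 0.5 selectivities from the example in the text; the ipe values and all function names are hypothetical and are not part of CEPSim.

```python
# Edge selectivities from the example in the text: s1 forwards half of its
# events to f1 and the other half to f2.
edge_selectivity = {("s1", "f1"): 0.5, ("s1", "f2"): 0.5}

# Hypothetical ipe values (CPU instructions per event); not taken from CEPSim.
ipe = {"s1": 10_000, "f1": 5_000, "f2": 5_000}


def events_sent(processed, u, v):
    """Number of events u forwards to v after processing `processed` events."""
    return processed * edge_selectivity[(u, v)]


def vertex_selectivity(u):
    """Vertex selectivity: sum of the selectivities of u's outgoing edges."""
    return sum(s for (src, _), s in edge_selectivity.items() if src == u)


def instructions_needed(u, n_events):
    """CPU instructions required for vertex u to process n_events events."""
    return ipe[u] * n_events
```

With these definitions, `events_sent(100, "s1", "f1")` evaluates to 50.0, matching the example in the text, and `vertex_selectivity("s1")` evaluates to 1.0.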

Third, CEPSim also introduces the "windowed" stereotype to characterize operators that process windows of events and combine them in some manner. Typical examples are aggregation operators that count events or calculate the average value of attributes. This new stereotype is necessary because the simulation of windowed operators is implemented by a different algorithm that requires information not included in the regular "operator" stereotype. In particular, windowed operators have three extra attributes: a window size, an advance duration, and a combination function.

Figure 6.3 illustrates the window and advance concepts. The window specifies the period of time from which the events are taken, and the advance duration defines how the window slides when the previous window closes. Finally, the combination function is defined as:

$$f : \mathbb{R}^m_{\geq 0} \to \mathbb{R}_{\geq 0} \tag{6.1}$$

where m is the number of operator predecessors. This function regulates the number of events that are sent to the output given the number of events accumulated in the input. Commonly, it is defined as the constant function $f(\vec{x}) = 1$, meaning that only one event is generated for each window (e.g., for counting events).
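As a rough illustration of the combination function, the sketch below writes the constant function from the text in Python, next to one hypothetical alternative that is not taken from CEPSim. Both take the event counts accumulated from the m predecessors and return the number of events to output.

```python
from typing import Sequence


def constant_one(inputs: Sequence[float]) -> float:
    """f(x) = 1: one output event per window, whatever the input counts are."""
    return 1.0


def total_passthrough(inputs: Sequence[float]) -> float:
    """Hypothetical alternative: output as many events as were accumulated."""
    return float(sum(inputs))


# Event counts accumulated from m = 2 predecessors during one window
# (illustrative numbers only).
accumulated = [30.0, 12.0]
```

Here `constant_one(accumulated)` yields 1.0, as for a counting operator, whereas `total_passthrough(accumulated)` yields 42.0.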

Finally, every event producer p in CEPSim is associated with a generator function $g_p$ that determines the total number of events produced by p given a point in time. Formally, the generator function is defined as a monotonically increasing function from the time domain to the set of positive integers:

$$g_p : \mathbb{R}_{\geq 0} \to \mathbb{N}, \text{ s.t. } x \leq y \implies g_p(x) \leq g_p(y) \tag{6.2}$$
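One possible generator function, sketched here as an assumption and not as CEPSim's implementation, is a uniform-rate producer. The name `uniform_generator` and the `rate` parameter are hypothetical. The sketch returns 0 at time 0, that is, it treats $\mathbb{N}$ as including zero.

```python
import math


def uniform_generator(rate):
    """Build g_p for a producer assumed to emit `rate` events per time unit."""
    def g(t):
        if t < 0:
            raise ValueError("time must be non-negative")
        # Cumulative number of events produced up to time t.
        return math.floor(rate * t)
    return g


g_p = uniform_generator(rate=100)

# Because g_p is cumulative, the number of events produced in an interval
# is the difference between its values at the interval end points.
produced_in_interval = g_p(2.5) - g_p(1.0)
```

Because the function is cumulative and monotonically increasing, the difference `g_p(y) - g_p(x)` is never negative for `x <= y`.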

6.3.2 Event Sets

An event set is an abstraction that represents a batch of events and is the basic processing unit used by CEPSim. This abstraction was created to improve simulator performance and to assist in calculating the simulation metrics. Operators exchange event sets instead of individual events, and all system queues and temporary buffers are composed of event sets.

Formally, an event set e is an instance of an EventSet class that contains the following attributes:

• cardinality (cn): number of events in the set. The notation $|e|$ is used hereinafter as a shortcut for e.cn.

• timestamp (ts): a timestamp associated with the set, which can be used for various purposes. Most often, it contains the timestamp at which the set was created.

• latency (lt): the average of the latencies of the events in the set. Event latency is defined as the period of time elapsed from the event creation to the moment at which the event is added to the set.

• totals (tt): a map that, for each producer $v_p \in V_p$, stores the number of events that must have been produced by $v_p$ to originate the events currently in the set. The goal of this attribute is to track "caused by" (or "is result of") relationships between the events in the set and the produced events.

In addition to these attributes, four operations are also defined for event sets: sum, extract, select, and update.

• Sum: is applied to two event sets $e_1$ and $e_2$ and results in a new event set $e_r$ containing all events from both sets. It is defined as:

$$e_r = e_1 + e_2 \tag{6.3a}$$

such that

$$|e_r| = |e_1| + |e_2| \tag{6.3b}$$

$$e_r.ts = \frac{|e_1| \cdot e_1.ts + |e_2| \cdot e_2.ts}{|e_1| + |e_2|} \tag{6.3c}$$

$$e_r.lt = \frac{|e_1| \cdot e_1.lt + |e_2| \cdot e_2.lt}{|e_1| + |e_2|} \tag{6.3d}$$

$$e_r.tt : V_p \to \mathbb{R}_{\geq 0}, \text{ s.t. } e_r.tt[v_p] = e_1.tt[v_p] + e_2.tt[v_p] \tag{6.3e}$$

• Extract: is applied to an event set e and the number of events to be extracted n. The results are an event set $e_r$ consisting of the extracted events, and an event set $e_m$ containing the remaining events from e,

$$(e_r, e_m) = e - n \tag{6.4a}$$

such that

$$|e_r| = n \tag{6.4b}$$

$$e_r.tt : V_p \to \mathbb{R}_{\geq 0}, \text{ s.t. } e_r.tt[v_p] = (n / |e|) \cdot e.tt[v_p] \tag{6.4c}$$

$$|e_m| = |e| - n \tag{6.4d}$$

$$e_m.tt : V_p \to \mathbb{R}_{\geq 0}, \text{ s.t. } e_m.tt[v_p] = e.tt[v_p] - e_r.tt[v_p] \tag{6.4e}$$

and the latency and timestamp attributes of $e_r$ and $e_m$ are the same as in e.

• Select: is applied to an event set e and a selectivity s. It selects a subset of events from the event set:

$$e_r = e * s \tag{6.5a}$$

such that

$$|e_r| = |e| \cdot s \tag{6.5b}$$

and the remaining attributes of $e_r$ are the same as in e.

• Update: is applied to an event set e and a timestamp ts. It simply brings the event set latency and timestamp up to date, resulting in an event set $e_r$ such that

$$e_r.ts = ts \tag{6.6b}$$

$$e_r.lt = e.lt + (ts - e.ts) \tag{6.6c}$$

and the remaining attributes of $e_r$ are the same as in e.

Algorithm 6.1: Event set queue - dequeue operation.

Data: Q, event set queue
      n, number of events to be extracted

 1  function dequeue(Q, n)
 2    e ← empty event set
 3    while n > 0 and !isEmpty(Q) do
 4      h ← dequeue(Q)       // Extract the head of the queue Q
 5      if |h| > n then
 6        (h, r) ← h − n
 7        prepend(Q, r)      // Return r to the head of the queue Q
 8      end
 9      e ← e + h
10      n ← n − |h|
11    end
12    return e
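The four event set operations defined above (sum, extract, select, and update) can be sketched in Python as follows. This is a minimal illustration that follows Equations 6.3 to 6.6, not the CEPSim implementation: the method names, the handling of empty sets, and the input validation are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class EventSet:
    cn: float = 0.0    # cardinality
    ts: float = 0.0    # timestamp
    lt: float = 0.0    # average latency
    tt: Dict[str, float] = field(default_factory=dict)  # totals per producer

    def __add__(self, other: "EventSet") -> "EventSet":
        """Sum (6.3): ts and lt are cardinality-weighted averages."""
        total = self.cn + other.cn
        if total == 0:
            return EventSet()  # assumption: two empty sets sum to an empty set
        producers = set(self.tt) | set(other.tt)
        return EventSet(
            cn=total,
            ts=(self.cn * self.ts + other.cn * other.ts) / total,
            lt=(self.cn * self.lt + other.cn * other.lt) / total,
            tt={p: self.tt.get(p, 0.0) + other.tt.get(p, 0.0)
                for p in producers},
        )

    def extract(self, n: float) -> Tuple["EventSet", "EventSet"]:
        """Extract (6.4): split into (extracted, remaining) event sets."""
        if not 0 <= n <= self.cn:
            raise ValueError("n must be between 0 and the set cardinality")
        ratio = n / self.cn if self.cn else 0.0
        er_tt = {p: ratio * v for p, v in self.tt.items()}
        em_tt = {p: v - er_tt[p] for p, v in self.tt.items()}
        return (EventSet(n, self.ts, self.lt, er_tt),
                EventSet(self.cn - n, self.ts, self.lt, em_tt))

    def select(self, s: float) -> "EventSet":
        """Select (6.5): scale the cardinality by the selectivity s."""
        return EventSet(self.cn * s, self.ts, self.lt, dict(self.tt))

    def update(self, ts: float) -> "EventSet":
        """Update (6.6): move the timestamp to ts and age the latency."""
        return EventSet(self.cn, ts, self.lt + (ts - self.ts), dict(self.tt))


e1 = EventSet(cn=10, ts=0.0, lt=2.0, tt={"p1": 10})
e2 = EventSet(cn=30, ts=4.0, lt=6.0, tt={"p1": 60})
merged = e1 + e2
extracted, remaining = merged.extract(10)
```

In this example, `merged` has cardinality 40, timestamp 3.0, latency 5.0, and a total of 70 for producer p1. Extracting 10 events assigns a quarter of that total (17.5) to `extracted` and leaves 52.5 in `remaining`.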

6.3.3 Event Set Queues

An event set queue is simply a queue where the elements are event sets. As with any regular queue, it is possible to enqueue and dequeue elements in a first-in, first-out manner. In addition, an event set queue has an overloaded dequeue operation that receives the number of events to be extracted and returns an event set representing these events.

Algorithm 6.1 shows this operation in pseudo-code. The algorithm removes event sets from the queue Q until the resulting event set e reaches size n. When the removed event set h has more events than are required to complete n, the algorithm extracts the necessary events from h and returns the remaining events to the queue (lines 5-8).

Finally, an event set queue Q also has a cardinality defined as the sum of the cardinalities of all event sets in the queue:

$$|Q| = \sum_{e \in Q} |e|$$
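The dequeue operation of Algorithm 6.1 and the queue cardinality can be sketched in Python as shown below. To keep the example self-contained, it uses a simplified event set that tracks only cardinality and average latency; the class and method names are assumptions and do not reflect the CEPSim code base.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class EventSet:
    """Simplified event set: only cardinality and average latency are kept."""
    cn: float = 0.0
    lt: float = 0.0

    def __add__(self, other):
        total = self.cn + other.cn
        if total == 0:
            return EventSet()
        avg = (self.cn * self.lt + other.cn * other.lt) / total
        return EventSet(total, avg)

    def extract(self, n):
        """Split into (n extracted events, remaining events)."""
        return EventSet(n, self.lt), EventSet(self.cn - n, self.lt)


class EventSetQueue:
    def __init__(self):
        self._sets = deque()

    def enqueue(self, e):
        self._sets.append(e)

    def cardinality(self):
        """|Q|: sum of the cardinalities of all event sets in the queue."""
        return sum(e.cn for e in self._sets)

    def dequeue(self, n):
        """Remove up to n events and return them as a single event set."""
        result = EventSet()
        while n > 0 and self._sets:
            head = self._sets.popleft()
            if head.cn > n:
                # Take only what is needed; put the rest back at the head.
                head, rest = head.extract(n)
                self._sets.appendleft(rest)
            result = result + head
            n -= head.cn
        return result


q = EventSetQueue()
q.enqueue(EventSet(cn=10, lt=1.0))
q.enqueue(EventSet(cn=20, lt=4.0))
out = q.dequeue(15)
```

After the call, `out` holds 15 events with an average latency of 2.0 (10 events at 1.0 and 5 events at 4.0), and the queue cardinality drops from 30 to 15.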


Figure 6.4: Placement definitions.