2.4 Process Discovery

2.4.4 Online Process Discovery

Traditional process discovery algorithms, as introduced in Section 2.4.2, are static, offline approaches that analyse a historical event log. Their main goal is to improve the quality of the discovery results, i.e. optimising the simplicity of the model and its conformance with the input log (see Section 2.4.1). An alternative is the online approach based on Complex Event Processing techniques (see Section 2.2.3): events are processed immediately as they occur into information of a higher abstraction level (in this case BP models). The motivation is to have a run-time reflection of the employed processes based on up-to-date rather than historical information, which allows business analysts to react more quickly to changes or emerging bottlenecks and thereby optimise the overall performance of the monitored processes. In line with this objective, online process discovery algorithms have to deal with two additional challenges compared to traditional process discovery algorithms:

1. Online process discovery is executed in a real-time setting and is thus required to conform to special memory and execution time constraints, especially since many modern systems produce "big data", i.e. data that is too large and complex to simply store and process [138]. In particular, online algorithms should be able to (1) process an infinite number of events without exceeding a certain memory threshold and (2) process each event within a small and near-constant amount of time [27]. For instance, a naïve approach is to log every event and, with each new occurrence (or periodically), (re-)analyse the entire log with a traditional discovery algorithm. This approach does not comply with the run-time constraints for online process discovery since (1) the log grows with each event and will thus exceed the memory threshold after a finite number of events and (2) the run-time of the discovery algorithm increases with the ever-growing log. Additionally, to satisfy the run-time constraint, an online process discovery algorithm needs to run autonomously without human interaction or supervision. This excludes discovery algorithms that require manual effort, such as the step-wise approach based on the theory of regions (see Section 2.4.2) [243].

2. Online process discovery algorithms are required to deal with concept drift caused by processes that change dynamically at run-time (see BP Flexibility, Section 2.3). As discussed in the previous section, real-life BPs are often subject to externally or internally initiated change, which has to be reflected in the results of an online process discovery algorithm. Whereas concept drift detection algorithms try to determine the time and location of these changes, online process discovery algorithms are only required to accommodate them in the sense that they are reflected in the results. This avoids the conceptual difficulty of detecting gradual change, where the time of change cannot be exactly identified (thus making it impossible to split the log) because the behaviour of two differing BPs is contained in the current stream of events. On the downside, it becomes more difficult to find a fitting model if the behaviour of the two co-existing processes is conflicting. Generally, online discovery algorithms should be able to (1) reflect newly appearing behaviour as well as (2) forget outdated behaviour.

Although not specifically designed for it, incremental process mining as introduced in [37] does attempt to address the problem of concept drift to some extent. Here, the assumption is that the log does not yet contain the entire behaviour of the process (i.e. it is an incomplete log) at the time the initial (declarative) process is discovered. Additional behaviour that occurs after the initial discovery and is captured in a second log is analysed separately, and the new information is then added to the existing BP model. This is possible due to the structure of declarative BP specifications. The process of incrementally analysing log segments and then extending the BP model accordingly, i.e. incremental process mining, is motivated by the assumption that updating an already existing (declarative) BP model is easier than always analysing the complete log from scratch. One shortcoming of this approach is that only new behaviour is added while outdated behaviour is not removed. Another approach, called incremental workflow mining, is based on the same principle but discovers and adapts a Petri Net by incrementally processing log segments [119–121]. It is a semi-automatic (and prototypical) approach specifically designed for dealing with process flexibility in Document Management Systems and does not account for incomplete or noisy logs. A third incremental approach is presented in [219]; it utilises the theory of regions to create transition systems for successive sub-logs and eventually transforms them into a Petri Net. Albeit based on a slightly different concept, incremental process mining approaches can be considered for online process discovery: the event processing could be designed to group a number of successive traces into sub-logs, which are then individually analysed and used to incrementally update and extend the BP. However, a conceptual weakness of incremental mining approaches is their inability to forget old behaviour, which means they do not support, for instance, revolutionary changes as discussed in Section 2.3.

In the context of online process discovery, a synonymous term sometimes used is Streaming Process Discovery (SPD). The term was coined by Burattin et al. in [27]. In their work the HeuristicsMiner (HM) [278] (see Section 2.4.2) was modified for this purpose and a comprehensive evaluation of different event stream processing types was carried out. The fundamentals of the HeuristicsMiner remain the same, but the footprint in the form of the Directly-Follows Relation Count (see Figure 2.17) is dynamically adapted or rebuilt while processing the individual events. From this footprint the Causal Net (or, with an additional transformation, the Petri Net) is periodically extracted, e.g. for every event or every 1000 events, using the traditional HM methodology. For instance, for the evaluation of the different streaming methods the HM discovery was triggered every 50 events [27]. Three different groups of event streaming methods have been implemented and investigated:
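The interplay of per-event footprint updates and periodic model extraction can be illustrated with a minimal sketch. It is not the implementation from [27]: the event format `(case_id, activity)`, the function names and the unbounded dictionaries are assumptions made for illustration only, and the dependency measure is the commonly cited HeuristicsMiner formula (|a>b| − |b>a|) / (|a>b| + |b>a| + 1).

```python
from collections import defaultdict


def dependency(df, a, b):
    """HeuristicsMiner-style dependency measure computed from DF counts."""
    ab = df.get((a, b), 0)
    ba = df.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)


def stream_discovery(events, every=50):
    """Update directly-follows counts per event and yield periodic snapshots.

    `events` is an iterable of (case_id, activity) pairs (assumed format).
    Note: both dictionaries grow without bound here, so this sketch does NOT
    yet satisfy the memory constraint of online process discovery.
    """
    df = defaultdict(int)  # (a, b) -> how often b directly followed a
    last = {}              # case_id -> latest activity seen for that case
    for n, (case, activity) in enumerate(events, start=1):
        if case in last:
            df[(last[case], activity)] += 1
        last[case] = activity
        if n % every == 0:
            # This is where a full HM discovery would be triggered.
            yield n, dict(df)
```

Each yielded snapshot stands in for the point at which the Causal Net would be extracted; the methods discussed next differ mainly in how they keep the stored information bounded.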

Event Queue: The basic methodology of this approach is to collect events in a queue that represents a log which can be analysed in the traditional way of process discovery. Figure 2.19 shows three basic types of this methodology: (1) In the sliding window approach the queue is a FIFO (First-In-First-Out) queue, i.e. once the maximum queue length (queue memory) is reached, the oldest event in the queue is removed for every new event inserted. The figure shows the development of the log/queue for each triggered discovery analysis. (2) In the periodic reset approach the queue is reset whenever the maximum queue length is reached. (3) The uniform split represents a special case of the sliding window or periodic reset approach in which the queue memory equals the discovery frequency. This approach was not analysed by Burattin et al. but was included in the figure as a naïve baseline scenario.

[Figure: timelines of the logs L1–L8 analysed at each triggered discovery for (1) Sliding Window, (2) Periodic Reset and (3) Uniform Split, annotated with Discovery Frequency and Queue Memory, above a time axis showing the BP influence in the stream (BP1, BP2) and the moment of change]

Fig. 2.19 Different Types of Event Queues [27]

Note that the approaches using the hypothesis test for the concept drift detection algorithms, e.g. [23, 276], are based on a (shifting) uniform log split. The main advantage of the event queue approaches is that the event queue can be regarded as an event log and thus allows for other discovery/mining analyses on event logs, such as mining of the performance perspective. Two of the main disadvantages are: (1) each event is handled at least twice, once to store it in the queue and once or more to discover the model from the queue; and (2) only a strict interpretation of "history" is possible, since an event is either still in the queue or not, and as long as it is in the queue an older event has the same influence as a newer event. Essentially, the event queue approach is a simple method of using offline process discovery solutions to solve the online process discovery problem.
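The two analysed queue types can be sketched as follows. This is an illustrative sketch rather than the code evaluated in [27]; the class names, the `to_log` helper and the exact moment of the reset (on the first insertion after the queue is full) are assumptions.

```python
from collections import deque


class SlidingWindowQueue:
    """FIFO queue: once full, each new event evicts the oldest one."""

    def __init__(self, memory):
        self.events = deque(maxlen=memory)

    def add(self, event):
        self.events.append(event)  # deque drops the oldest entry when full


class PeriodicResetQueue:
    """Queue that is emptied completely once the maximum length was reached."""

    def __init__(self, memory):
        self.memory = memory
        self.events = []

    def add(self, event):
        if len(self.events) >= self.memory:
            self.events.clear()  # assumed: reset happens on the next insertion
        self.events.append(event)


def to_log(events):
    """Group (case_id, activity) events into traces for an offline algorithm."""
    traces = {}
    for case, activity in events:
        traces.setdefault(case, []).append(activity)
    return list(traces.values())
```

Calling `to_log(queue.events)` at every discovery trigger yields an ordinary event log, which is why any offline discovery algorithm can be reused unchanged, at the cost of handling each stored event again at every trigger.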

Stream-specific Approaches: Stream-specific approaches process events directly into footprint information, i.e. queues of a capped size hold information about the most recently occurring activities and directly-follows relations. When a new event occurs, all values in the queues are updated and/or replaced. Burattin et al. distinguish between the following three update operations: (1) Stationary, i.e. the queues function as a "sliding window" over the event stream and every queue entry has the same weight; (2) Ageing, i.e. the weight of the latest entry is increased and the weights of older entries in the queue are decreased; and (3) Self-Adaptive Ageing, i.e. the factor by which the influence of older entries decreases depends on the fitness of the discovered model with respect to the latest events, which are stored in an additional sample queue of a fixed size: the influence decreases quickly for a low fitness and slowly for a high fitness. Generally, stream-specific approaches are assumed to be more computation-balanced since events are handled only once and are directly processed into footprint information (as opposed to the event queue approaches, where the discovery cycle contains all computation-heavy algorithms) [27]. Burattin et al. also argue that ageing-based approaches offer a more realistic interpretation of "history" since older events have less influence than newer events. One disadvantage is that the footprint is captured through a set of queues with a fixed size: if this size is set too low, behaviour is prematurely forgotten (i.e. removed from the queue); if it is set too high, certain information might never be forgotten. The effect of an overly large queue length is mitigated for the ageing HM approaches since the weight of old entries continuously shrinks until they are eventually treated as irrelevant, noisy behaviour by the core HM algorithm.
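The ageing update can be sketched as a weighted directly-follows footprint. This is a simplified illustration under stated assumptions, not the data structure from [27]: the class name, the multiplicative ageing factor `alpha`, and the eviction of the lowest-weight entry when the cap is exceeded are all hypothetical choices.

```python
class AgeingFootprint:
    """Weighted directly-follows footprint with a capped number of entries."""

    def __init__(self, alpha=0.9, max_entries=1000):
        self.alpha = alpha              # assumed ageing factor in (0, 1]
        self.max_entries = max_entries  # assumed cap on stored relations
        self.df = {}                    # (a, b) -> aged weight
        self.last = {}                  # case_id -> latest activity (not capped here for brevity)

    def observe(self, case, activity):
        # Age every stored relation so that older observations lose influence.
        for pair in self.df:
            self.df[pair] *= self.alpha
        prev = self.last.get(case)
        if prev is not None:
            pair = (prev, activity)
            self.df[pair] = self.df.get(pair, 0.0) + 1.0
            if len(self.df) > self.max_entries:
                # Evict the relation with the lowest remaining weight.
                weakest = min(self.df, key=self.df.get)
                del self.df[weakest]
        self.last[case] = activity
```

With `alpha = 1.0` no ageing takes place and all stored observations keep equal weight. A self-adaptive variant would adjust `alpha` between events based on a fitness estimate, which is omitted here.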

Lossy Counting: Lossy Counting is a technique adopted and modified from [139] that uses approximate frequency counting and divides the stream into buckets of a fixed width.
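For orientation, the following sketch shows Lossy Counting in its generic textbook form, counting arbitrary hashable items such as activities or directly-follows pairs. It is not the modified variant used for process discovery; the class name and the parameter `epsilon` (maximum approximation error, which determines the bucket width) are assumptions of this illustration.

```python
import math


class LossyCounter:
    """Approximate frequency counting over a stream with bounded memory."""

    def __init__(self, epsilon):
        self.width = math.ceil(1 / epsilon)  # bucket width
        self.n = 0                           # number of items seen so far
        self.entries = {}                    # item -> [count, max. undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # The item may have been dropped earlier at most bucket - 1 times.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # At each bucket boundary, drop items that are too infrequent.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }
```

Infrequent items are pruned at every bucket boundary, so memory stays bounded while frequent items keep a count that underestimates the true frequency by at most `epsilon * n`.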

The evaluation of these methods was carried out for three example processes of different sizes (8-19 activities) that did not involve looping constructs. These were simulated, and two of them were modified during execution at specific points in time. The quality of the results was measured using the criteria outlined in Section 2.4.1: fitness [3], precision [155], generalisation, and simplicity. The conformance measures fitness, precision, and generalisation were computed on the basis of a queue of a fixed size containing the latest events, which represents the sample log (similar to the sliding window approach). The conclusions, however, were mostly drawn with regard to the fitness (recall) and precision criteria. Additionally, computation time and memory usage were investigated. In summary, the evaluation yielded the following results: With regard to computation time and memory consumption, the stream-specific approaches (stationary, ageing, self-adaptive ageing) outperformed both the event queue approaches (sliding window, periodic reset), especially for larger queue lengths, and the lossy counting approach. However, only the time for processing an event was investigated, not the time needed for the discovery of the process. Self-adaptive ageing, for instance, would require more time since conformance criteria have to be computed to adapt the ageing factor. Considering the conformance criteria, all approaches performed similarly well for stationary streams if suitable parameters were chosen (e.g. ageing factor, queue lengths, number of buckets) and converged to the best fitting BP after some time. For streams with concept drift, the stream-specific approaches and lossy counting performed better than the event queue approaches. For the larger example, the stationary streaming approach (with the same weights) achieved a significantly higher fitness, which can be explained by the experiment setup: the conformance checking treats every event in the sample queue equally, independent of its "age".

Another approach for discovering concept drifts on event streams, of less relevance to the topic of this thesis, is presented in [138]: a discovery approach for declarative process models that uses the sliding window approach and lossy counting to update a set of valid business constraints according to the events occurring in the stream.

2.5 Performance Decision Support for Business Processes