Feedback Loop for Distributed Workflows - Scalable Execution of Self-managed CPS Workflows

6. Scalable Execution of Self-managed CPS Workflows

6.4. Feedback Loop for Distributed Workflows

To illustrate the applicability of the proposed MAPE-K framework and the Feedback Service component in other contexts, we apply the framework to the distributed execution of subprocesses on a mobile service robot (Turtlebot) acting as execution peer [SHA17]. Figure 6.5 shows the adaptation of the Autonomic Manager to manage the process resource (Turtlebot) executing a subprocess. The exemplary subprocess “Retrieve Insulin Injection” stems from the process shown in Figure 5.10. It consists of multiple process steps instructing the service robot to drive to different destinations. The successful execution of this subprocess is crucial, but the robot is vulnerable to various physical and technical errors and obstacles, e.g., the drainage of its battery, the navigation algorithms cancelling due to unknown obstacles or loss of orientation, or the loss of WiFi connectivity. In this example, we show the execution of the MAPE-K feedback loop to check the reachability of the mobile robot and to execute compensating actions in case it gets disconnected from the network. This application shows that, besides criteria concerning Cyber-physical Consistency, QoS and arbitrary other constraints can also be defined to verify and, if necessary, adapt the process execution when using the MAPE-K feedback loop for the self-management of workflows. Additional scenarios and applications of the MAPE-K principles in the context of distributed process executions can be found in Chapter 7.

6.4.1. Knowledge Base

In this scenario, we use the part of the context model in the Knowledge Base (KB) that refers to the Turtlebot robot as presented in Section 4.3.3. The robot acts as a peer in the peer–super-peer architecture (cf. Section 5.5.1). Because PROtEUS is running on the robot, it is able to execute instances of process steps, and it publishes various runtime metrics (e.g., last heartbeat and battery level). The KB contains information regarding super-peers and their associated peers as well as the process instances that are running on super-peers and peers.

Figure 6.6.: Interaction between PROtEUS and the Feedback Service to Execute Distributed Processes.

6.4.2. Goal and Objective

Listing 6.3 shows the goal and objective for this use case. It refers to the point in time of the robot’s last liveliness signal (heartbeat) and the execution state of the particular subprocess. The satisfied condition states that the execution went well if the subprocess’s state is “executed” (Line 5). The compensation condition, on the other hand, states that a search for a compensation has to be initiated when the last heartbeat was received more than 5 seconds ago and the subprocess is still “executing”, i.e., the robot is very likely to have lost its network connection while executing the process instance (Line 6). The context path is specified to retrieve the peer’s (Turtlebot1) last heartbeat and the execution state of the subprocess (Lines 7–9). The identifier of the subprocess instance is used to link the instance to the specific execution peer (RUNS_ON relation). Figure 6.6 shows the interaction between the PROtEUS components and the Feedback Service for this setup. The execution of the MAPE-K feedback loop for this scenario is depicted in Figure 6.7.

6.4.3. Deployment and Instantiation

The overall process is modelled in accordance with Figure 5.10. “Turtlebot1” is specified as the responsible process resource and the goal modelled for the “RetrieveInsulinInjection” subprocess corresponds to Listing 6.3. As this example process is executed as a distributed process, the PROtEUS workflow system on the super-peer (D-PROtEUS) instantiates the main process and begins its execution. Once it reaches the subprocess in question, it evaluates the process resource attribute and searches for the corresponding peer; the Distribution Manager of the super-peer then sends the subprocess to Peer1 (Turtlebot1). This requires the Turtlebot1 peer to be connected to and registered with the super-peer, and the PROtEUS WfMS to be running on this peer. In parallel to sending the request to execute an instance of the subprocess to the peer, the super-peer D-PROtEUS system invokes the Feedback Service (FB Service) with the goal in order to execute the MAPE-K loop for this distributed subprocess instance, which is marked as a managed process step (cf. Section 4.5.2).

Figure 6.7.: Sequence Chart for the MAPE-K Loop applied to the Distributed Robot Process.

Listing 6.3: Goal and Objective for Distributed Subprocess Execution on Turtlebot.

     1  "RetrieveInsulinInjection": {
     2    "name": "execution conformance and liveliness",
     3    "objectives": [
     4      { "name": "heartbeat < 5 seconds and executed",
     5        "satisfiedCondition": "#processState == 'executed'",
     6        "compensationCondition": "#timeFrom(#heartBeat).isBefore(#now.minusSeconds(5)) and #processState == 'executing'",
     7        "contextPaths": [
     8          "MATCH (n:NeoProcess {processId: 'RetrieveInsulinInjection'})-[r:RUNS_ON]->(p:NeoPeer)",
     9          "RETURN n.state AS processState, p.lastHeartbeat AS heartBeat"
    10        ] } ] }
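The dispatch step of Section 6.4.3 can be illustrated with a minimal Python sketch. It is not the PROtEUS implementation: the classes `Peer` and `FeedbackService`, the function `dispatch`, and the dictionary-based peer registry are hypothetical names introduced only for illustration. The sketch assumes that the super-peer looks up the peer named by the process resource attribute, refuses to dispatch if that peer is not registered and connected, and otherwise sends the subprocess and starts the feedback loop concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical stand-in for an execution peer registered with the super-peer."""
    name: str
    connected: bool = True
    received: list = field(default_factory=list)

    def execute(self, subprocess_id: str) -> str:
        # Record the request; a real peer would start a process instance here.
        self.received.append(subprocess_id)
        return f"{subprocess_id} accepted by {self.name}"


@dataclass
class FeedbackService:
    """Hypothetical stand-in for the Feedback Service."""
    managed: dict = field(default_factory=dict)

    def start_loop(self, subprocess_id: str, goal: dict) -> str:
        # Remember the goal under which this subprocess instance is managed.
        self.managed[subprocess_id] = goal
        return f"MAPE-K loop started for {subprocess_id}"


def dispatch(subprocess_id, resource, registry, feedback, goal):
    """Send the subprocess to its peer and start the feedback loop in parallel."""
    peer = registry.get(resource)
    if peer is None or not peer.connected:
        raise LookupError(f"peer {resource!r} is not registered or not connected")
    with ThreadPoolExecutor(max_workers=2) as pool:
        sent = pool.submit(peer.execute, subprocess_id)
        loop = pool.submit(feedback.start_loop, subprocess_id, goal)
        return sent.result(), loop.result()
```

Calling `dispatch("RetrieveInsulinInjection", "Turtlebot1", registry, feedback, goal)` with a registry that contains a connected `Turtlebot1` returns both acknowledgements. An unknown or disconnected peer raises `LookupError` before anything is sent.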

6.4.4. Monitor

Upon receiving the request from the super-peer, the Feedback Service starts monitoring the data from the Knowledge Base (KB) as defined in the context path (CP). Peer1 sends periodic status updates regarding the state of the subprocess execution. This information is used to update the process execution related data (“processState”) contained in the KB, and the message’s timestamp is used to set the “heartBeat” property of the peer to indicate its last liveliness signal. Both values are monitored by the FB Service. With every relevant update of these values (symptoms), the Analyser is triggered to evaluate the data.
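The monitoring behaviour described in Section 6.4.4 can be sketched as follows. This is a simplified illustration under stated assumptions, not the Feedback Service itself: the KB is reduced to an in-memory dictionary, timestamps are plain floats in seconds, and the names `Symptom`, `Monitor` and `on_status_message` are hypothetical. A "relevant update" is assumed here to mean any change of either monitored value.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Symptom:
    """The two monitored values, named after the aliases in the context path."""
    process_state: str
    heart_beat: float  # timestamp (seconds) of the peer's last status message


class Monitor:
    """Hypothetical monitor: keeps the KB values current and forwards changes."""

    def __init__(self, on_symptom: Callable[[Symptom], None]):
        self._on_symptom = on_symptom
        # In-memory stand-in for the KB properties read via the context path.
        self.kb = {"processState": None, "heartBeat": None}

    def on_status_message(self, state: str, timestamp: float) -> None:
        changed = (self.kb["processState"] != state
                   or self.kb["heartBeat"] != timestamp)
        self.kb["processState"] = state
        self.kb["heartBeat"] = timestamp  # message timestamp is the heartbeat
        if changed:
            # Only relevant updates trigger the Analyser.
            self._on_symptom(Symptom(state, timestamp))
```

Passing an Analyser callback as `on_symptom` wires the two phases together. A repeated message with an identical state and timestamp is not forwarded.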

6.4.5. Analyse

The Analyser processes the symptoms with respect to the compensation condition and the satisfied condition. It will not initiate any actions if the process is in state “executing” and the last liveliness signal was received less than 5 seconds ago. Figure 6.7 shows that, after sending three status updates, the Turtlebot1 peer stops publishing messages. Eventually, this leads to the Analyser sending a change request to search for a compensation to the Planner, as the compensation condition becomes true after 5 seconds of not receiving any new symptoms while the process is still in state “executing”.

Listing 6.4: Compensation Query regarding Distributed Process Execution.

     1  Match (original:NeoProcess)
     2  WHERE ID(original) = {processNodeId}
     3  WITH original
     4
     5  Match (remote:NeoProcess)-[remoteFor:REMOTE_FOR]->(original)
     6  WITH remote, original
     7
     8  Match (original)-[runsOnSuper:RUNS_ON]->(originalPeer:NeoPeer)
     9  WITH originalPeer, remote, original
    10
    11  Match (remote)-[runsOnRemote:RUNS_ON]->(remotePeer:NeoPeer)
    12  WITH remotePeer, originalPeer, remote, original
    13
    14  MATCH (newPeer:NeoPeer)
    15  WHERE newPeer <> originalPeer AND newPeer <> remotePeer
    16
    17  RETURN original, remotePeer, newPeer
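The decision rule applied by the Analyser in Section 6.4.5 can be written down as a small Python function. This is only a sketch of the two conditions from Listing 6.3 translated by hand; the names `Verdict` and `analyse` are hypothetical, and the current time is passed in explicitly so that the rule can be evaluated for any point in time. The comparison is strict, mirroring `isBefore` in the compensation condition.

```python
from datetime import datetime, timedelta
from enum import Enum


class Verdict(Enum):
    SATISFIED = "satisfied"
    COMPENSATE = "compensate"
    NO_ACTION = "no action"


def analyse(process_state: str, heart_beat: datetime, now: datetime,
            timeout: timedelta = timedelta(seconds=5)) -> Verdict:
    """Evaluate the satisfied and compensation conditions for one symptom."""
    # satisfiedCondition: #processState == 'executed'
    if process_state == "executed":
        return Verdict.SATISFIED
    # compensationCondition: heartbeat strictly older than (now - 5 s)
    # while the subprocess is still executing.
    if heart_beat < now - timeout and process_state == "executing":
        return Verdict.COMPENSATE
    return Verdict.NO_ACTION
```

With a last heartbeat at time t, the function yields `NO_ACTION` at t + 3 s and `COMPENSATE` at t + 6 s as long as the state remains “executing”, which corresponds to the silent peer in Figure 6.7.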

6.4.6. Plan

In the context of a peer failing during the distributed execution of a subprocess instance, the Planner searches for an alternative peer registered with the super-peer to repeat the execution of the subprocess instance. The compensation repository contains the compensation query shown in Listing 6.4, which is executed in this context. Starting from the data related to the subprocess execution on the peer (Lines 5–6), the query determines the corresponding super-peer and the main process on the super-peer (Lines 8–9). It then tries to find a new peer for execution that is distinct from both the super-peer and the failed peer (Lines 11–17). This alternative peer (Turtlebot2) is transferred to the Executor as part of the Change Plan.
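The selection logic of the compensation query can be mirrored in a few lines of Python. This is a sketch under simplifying assumptions: peers are represented by their names, the registry is a plain list, and the names `ChangePlan` and `plan_compensation` are hypothetical. Listing 6.4 returns every peer that differs from the super-peer and the failed peer; picking the first candidate is an assumption made here for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ChangePlan:
    """Hypothetical change plan handed from the Planner to the Executor."""
    process_id: str
    failed_peer: str
    new_peer: str


def plan_compensation(process_id: str, super_peer: str, failed_peer: str,
                      registered_peers: Sequence[str]) -> Optional[ChangePlan]:
    """Pick a peer that is neither the super-peer nor the failed peer."""
    candidates = [p for p in registered_peers
                  if p != super_peer and p != failed_peer]
    if not candidates:
        return None  # no alternative peer available, no plan can be built
    # Assumption: take the first candidate; no capability check is done.
    return ChangePlan(process_id, failed_peer, candidates[0])
```

For the registry `["SuperPeer1", "Turtlebot1", "Turtlebot2"]` and the failed peer `Turtlebot1`, the resulting plan names `Turtlebot2` as the new peer. The function returns `None` when no other peer is registered.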

The strategy of selecting an alternative peer to repeat the execution in case the original peer fails is used here to keep the explanation simple. More sophisticated strategies and compensation queries that take into account the capabilities of the respective devices and the progress of the process execution on the failed peer should be considered for this and other cyber-physical scenarios. If Turtlebot1 has already retrieved the injection and fails on its way to the resident, the repetition of the entire subprocess by Turtlebot2 could cause additional problems, as there may not be a second injection available on the shelf. As already discussed in Section 6.2.6, a more sophisticated classification of errors and corresponding compensation strategies needs to be developed to ensure the successful execution of processes and feedback loops in CPS.

In document Self-managed Workflows for Cyber-physical Systems (Page 152-157)