Accuracy - Improving the model management workflow

This section evaluates the results that the tool generates: whether they are correct and, where they are not, how the failure affects the tool's credibility. The first part validates the set of differences found. Are differences missing, or are differences reported that are not really differences? The next part verifies the impact analysis: does it show false positives (a submodel is reported as operationally different, but it is not) or false negatives (a submodel is not reported as operationally different, but it is)? Finally, the last part covers the impact score. Does the tool give a score that would be expected by a person who does not know how the impact score system works?

5.3.1 Verifying differences

To test whether the differences reported by the tool are correct, Controllab provided several customer models, each consisting of several versions. For reasons of non-disclosure, neither the models nor the customers are published. However, Table 5.4 gives an overview of all differences found, sorted by type and transformation operation (add, update or delete). For each updated submodel, the table also indicates whether its equations changed according to the equations dialogue. Finally, the last row lists submodels that were reported as a difference but did not in fact show any differences.

These unchanged submodel differences are often caused by adding junctions, splitters or adders in 20-sim. Such elements operate on all of their inputs or all of their outputs without naming the individual variables. For example, the adder sums all inputs and places the result on the output, and its equations state only that the sum of all inputs is taken. Removing one input from this sum therefore does not show up in the equations, even though it is definitely a valid difference.
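The effect can be illustrated with a small sketch. The `Adder` class and the `compare` function below are hypothetical stand-ins, not part of 20-sim or of the comparison tool; they only show why a textual comparison of the equations misses a removed input while a comparison of the ports catches it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adder:
    """Hypothetical stand-in for an adder submodel (not the 20-sim API)."""
    inputs: tuple  # names of the connected input ports

    @property
    def equations(self) -> str:
        # The equation never names individual inputs; it only sums them all.
        return "output = sum(inputs)"


def compare(old: Adder, new: Adder) -> dict:
    """Report equation and port differences between two versions."""
    return {
        "equations_changed": old.equations != new.equations,
        "ports_removed": sorted(set(old.inputs) - set(new.inputs)),
        "ports_added": sorted(set(new.inputs) - set(old.inputs)),
    }


v1 = Adder(inputs=("in1", "in2", "in3"))
v2 = Adder(inputs=("in1", "in2"))
result = compare(v1, v2)
```

Here `result["equations_changed"]` is false although `result["ports_removed"]` contains `in3`, which matches the situation described above: the difference is real, but the equations alone do not reveal it.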

Another thing to note is that, fortunately, all operationally changed equation models were also labelled as such: there were no false negatives. This matters because reporting that a submodel did not change operationally when it actually did gives the user a wrong impression of the relevance of the difference.

The failures in removing a port (which were actually additions of a port, but were shown as removals) and the two failed updates of graphical models are plain faults in the tool and have to be fixed.

5.3.2 Verifying impact scores

Currently, four factors determine the overall impact score; they were discussed in section 4.7. Because each factor is either true or false, the total number of possible combinations is 2^4 = 16. Every submodel difference therefore falls into exactly one of these 16 states. Because the tool does not yet support adding user-based learning examples via the graphical user interface, a given state always results in the same impact score. These currently fixed impact scores are shown per state in Table 5.5.

                        no implementation   no implementation   implementation   implementation
                        no parameter        parameter           no parameter     parameter
no equation, no port    0%                  20%                 20%              50%
no equation, port       30%                 60%                 60%              80%
equation, no port       50%                 70%                 70%              80%
equation, port          70%                 90%                 90%              100%

Table 5.5: Impact scores for all 16 possible states. Here "no" means that there was no difference of the given type: the first column has no implementation change and no parameter change, whereas the first row has no operational change to the equations and no change in the ports of that submodel.
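As a minimal sketch, the fixed scores of Table 5.5 can be written as a lookup keyed by the four boolean factors. The names `IMPACT` and `impact_score` and the ordering of the key are assumptions made for this illustration; they do not reflect how the tool stores the scores internally.

```python
from itertools import product

# Row and column order of Table 5.5; True means "this type of change is present".
ROWS = [(False, False), (False, True), (True, False), (True, True)]  # (equation, port)
COLS = [(False, False), (False, True), (True, False), (True, True)]  # (implementation, parameter)

# Impact scores in percent, copied from Table 5.5.
SCORES = [
    [0, 20, 20, 50],
    [30, 60, 60, 80],
    [50, 70, 70, 80],
    [70, 90, 90, 100],
]

# Key: (equation, port, implementation, parameter).
IMPACT = {
    row + col: SCORES[i][j]
    for i, row in enumerate(ROWS)
    for j, col in enumerate(COLS)
}


def impact_score(equation, port, implementation, parameter):
    """Return the fixed impact score (in percent) for one state."""
    return IMPACT[(equation, port, implementation, parameter)]


# Enumerating four boolean factors yields the 2^4 = 16 states.
all_states = list(product([False, True], repeat=4))
```

For example, `impact_score(False, True, True, False)` looks up the state with a port change and an implementation change only, which the table lists as 60%.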

These impact scores are based on initial guesses about simulation impact. Each of the 16 possible states currently has four learning examples attached to it. These learning examples are manually chosen impact scores that suit the state. A state is then presented to the tool, which computes the impact score from the learning examples it knows. This is how the impact scores in Table 5.5 were computed.
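The relation between learning examples and the resulting score can be sketched as follows. This is an illustration under explicit assumptions: the text here does not specify how the tool combines the examples, so the sketch simply averages the examples of a state, and the class name `ImpactModel` as well as the four example scores are invented for the illustration.

```python
from collections import defaultdict
from statistics import mean


class ImpactModel:
    """Hypothetical model: the score of a state is the mean of its examples."""

    def __init__(self):
        self._examples = defaultdict(list)

    def add_example(self, state, score):
        """Attach one learning example (a score in percent) to a state."""
        self._examples[state].append(score)

    def score(self, state):
        """Return the mean score of all learning examples for the state."""
        if state not in self._examples:
            raise KeyError(f"no learning examples for state {state}")
        return mean(self._examples[state])


# State key: (equation, port, implementation, parameter).
parameter_only = (False, False, False, True)

model = ImpactModel()
# Four invented learning examples, chosen so that their mean is the 20%
# that Table 5.5 lists for a parameter-only change.
for example_score in (10, 20, 20, 30):
    model.add_example(parameter_only, example_score)

initial = model.score(parameter_only)

# One additional (invented) user-supplied example shifts the outcome.
model.add_example(parameter_only, 70)
updated = model.score(parameter_only)
```

With these invented numbers the score starts at 20 and moves to 30 after the extra example, which shows how the fixed scores would stop being fixed once users can add their own examples.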

The current algorithm already allows any number of factors to be added to the set that determines the impact score, and these factors can have any range. The current factors are all boolean (false or true, represented as 0 or 1), so each takes only two values. However, it is possible to add, for example, the percentage change between the simulation results of model version 1 and those of model version 2, expressed on a range of 0 to 100 percent. Furthermore, with the current four factors it is also possible to change the outcome by adding other learning examples. Even though the graphical user interface does not support this yet, the underlying code has no problem with it. For example, several users could be asked to indicate, for a set of differences, how relevant they think each difference is. If these examples were added to the set of learning examples, a much broader view of what constitutes a proper impact score would be obtained. The "Discussion" and "Conclusions and Recommendations" sections continue the search for a better way of identifying impact, depending on the actual reason for making the model.
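One way to picture factors with different ranges is to normalise every factor to the interval from 0 to 1 and to estimate the score of an unseen state from nearby learning examples. The sketch below uses inverse-distance weighting for this purpose. That technique is an assumption chosen for illustration, not the algorithm of the tool, and the function names, the fifth factor and all scores are hypothetical.

```python
import math


def normalise(value, low, high):
    """Map a factor value from the range [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


def predict(query, examples, power=2):
    """Estimate a score as an inverse-distance-weighted mean of examples.

    query    -- tuple of normalised factor values
    examples -- list of (factors, score) pairs, factors normalised likewise
    """
    exact = [score for factors, score in examples if factors == query]
    if exact:
        # Known state: use the plain mean of its own examples.
        return sum(exact) / len(exact)
    weights = [1.0 / math.dist(query, factors) ** power for factors, _ in examples]
    scores = [score for _, score in examples]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Factors: (equation, port, implementation, parameter, result change).
# The result-change factor and both scores are invented for illustration.
examples = [
    ((0, 0, 0, 1, normalise(0, 0, 100)), 20),
    ((0, 0, 0, 1, normalise(100, 0, 100)), 80),
]

# A parameter change that alters the simulation results by 50 percent.
query = (0, 0, 0, 1, normalise(50, 0, 100))
estimate = predict(query, examples)
```

Because the query lies exactly halfway between the two invented examples, the estimate is the midpoint of their scores. Normalising first keeps a factor with a range of 0 to 100 from dominating the boolean factors in the distance computation.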

Chapter 6

Discussion

The original main goal of this master assignment was defined as follows: "To ease the process of model management for a user that would like to focus on the development of models rather than the management of them." Even though a step has been made towards this goal, quite some work remains to reach a level of model management that demands minimal effort from the user. This master assignment did provide an analysis of how to use an existing version control system as the basis of this model management trajectory. Furthermore, the comparison tool satisfies the second subgoal, which was defined as "giving the user a means to compare two versions of the same model". This step has been realised as a proof of concept, showing that it is possible to use 20-sim to compare two versions of a model for differences.

What this master assignment did not provide, however, is a full solution to the two other subgoals, even though both were touched upon. The first subgoal was to "give the user a means of easily storing and retrieving versions of a model". Some scripts were written for the version control system Git, showing that this is possible and that all actions of a file-based version control system can be reused for a model-based version control system, with the exception of model comparison. However, a full-blown implementation of model-based version control was not developed during this master assignment.

The third subgoal was defined as "giving the user a means to link his model to external factors as results, code, tests, and requirements". This subgoal was investigated during the literature research in the form of the concepts of traceability and provenance. These concepts are discussed in relation to this master assignment in the section below about literature research.
