
Correlation based Physarum Learner (C-PhyL)

3.4 Conclusion and future work

A novel structure learning algorithm for Bayesian networks has been introduced which uses a new approach to estimating independences from data. A bio-inspired mathematical model called Physarum Solver is used to find the paths over pairwise correlations that best explain the correlation between two variables, assuming that the direct correlation between the two variables is unknown. Using this strategy, the algorithm tries to find higher-order correlations. The novel method, called C-PhyL, was described and its algorithm parameters were studied. Two major parameters were observed to influence the learning quality of C-PhyL: γ and the function f(Q) used within the Physarum Solver. Next, C-PhyL was compared to three state-of-the-art structure learning methods on a set of artificially generated benchmark networks with different network characteristics. The novel algorithm performs adequately for some networks but could not reach the quality of the state-of-the-art learning methods regarding the Bayesian score and the quality of learned arcs. However, the results showed that C-PhyL learns fewer extra arcs than the other methods, indicating a higher specificity of arcs.

Figure 3.6: Logarithm of execution time in milliseconds as a function of the number of instances for benchmark network Asia using C-PhyL, LAGD and Tabu Search. [Plot not reproduced; x-axis: number of instances, 1000 to 10,000,000; y-axis: logarithm of execution time in milliseconds, 0 to 14.]

It was further observed that, for most correctly identified arcs, the correct direction could not be estimated by using an ordering-based approach. To determine the ordering, the simple score-based approach was used. In future work, more advanced methods might be tested to find a better way of determining a valid ordering. Another method to direct arcs within the C-PhyL algorithm is to use the PC algorithm introduced by Spirtes et al. [188], which tries to find the directions of a network skeleton by performing independence tests. Another, and probably the most obvious, technique (as fluxes in the real slime mold do have directions) is to update the Physarum Solver algorithm by considering Kirchhoff's second law, which states that the directed sum of potential differences around any closed loop is zero. This law must also hold for the flux in the Physarum Solver, where the flow in a closed loop of tubes has to run in the same direction. By updating the transportation equation accordingly to consider this law, the Physarum Solver itself would be able to determine the directions in which the sol flows. The direction of the final edges in the Bayesian network could therefore be derived from the Physarum Solver. A problem that has to be solved for this approach is that Kirchhoff's law needs to be applied to closed loops, while Bayesian networks are acyclic by definition. In the initial state, where the Physarum-Maze is fully connected, each connection is automatically part of several loops. However, as connections are cut out of the Physarum-Maze in each Physarum Solver iteration, the Physarum-Maze transforms into a graph with fewer and fewer loops until the final state includes only a path.
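The loss of loops during pruning can be made concrete with the cyclomatic number E − V + C of an undirected graph (edges minus nodes plus connected components), which counts its independent loops. The following is a minimal Python sketch; the function name and the toy mazes are illustrative assumptions and not part of the C-PhyL implementation.

```python
from itertools import combinations


def independent_loops(num_nodes, edges):
    """Cyclomatic number E - V + C of an undirected graph.

    Hypothetical helper for illustration; `edges` is a list of (u, v) pairs.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_nodes
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return len(edges) - num_nodes + components


# Fully connected toy maze with 4 nodes versus a final state that is a path.
full_maze = list(combinations(range(4), 2))
final_path = [(0, 1), (1, 2), (2, 3)]
loops_initial = independent_loops(4, full_maze)
loops_final = independent_loops(4, final_path)
```

For the fully connected toy maze there are three independent loops to which a loop law could be applied, whereas the final path contains none.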

Thus, defining Kirchhoff's laws becomes more difficult as the iterations proceed and is not obvious once most loops have been cut out. For that reason, a modified Physarum Solver which is also able to learn directions by using Kirchhoff's second law is left for future work.

Another problem of the C-PhyL algorithm can be observed in Figure 3.5a, where the learned network structure of the insurance benchmark network is shown. Node Theft is not connected to any other node, although Algorithm 3 tries to avoid unconnected nodes. The reason is that node Theft never occurs as a child node. For future experiments, Algorithm 3 needs to be updated so that connections are also added to the final Bayesian network if a connection's parent node is still unconnected.

Execution time analysis showed a disadvantage for C-PhyL with an increasing number of nodes but a benefit if the number of instances grows dramatically. C-PhyL has to determine the pairwise correlation coefficients only once, while score-based algorithms have to touch the dataset in each optimization step. Weka's implementation stores Bayesian networks and datasets within objects, while the custom implementation of the Physarum Solver holds Physarum-Maze and dataset information in arrays. Thus, within C-PhyL, data has to be copied from Weka objects to arrays several times, which is very time and memory intensive for larger datasets. Further, the linear equation system that has to be solved in each iteration of the Physarum Solver to calculate the pressure values is solved by using an implementation of a singular value decomposition method, while Tero et al. reported that they used Incomplete Cholesky Conjugate Gradient, which might be faster. Another approach to speed up C-PhyL is to use shuttle streaming as introduced by Siriwardana et al. [182], who showed that the Physarum Solver could be much faster when using their proposed modifications. Once these implementation inefficiencies are resolved, the C-PhyL algorithm has high potential to be used for datasets including many instances, such as streaming data.
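To illustrate the kind of linear system involved, the sketch below solves the pressure equations of a small toy maze with a plain, unpreconditioned conjugate gradient method. This is neither the incomplete Cholesky variant reported by Tero et al. nor the SVD-based solver used here. It assumes the usual Physarum Solver formulation in which the flux through a tube is its conductance times the pressure difference, the source receives a net flux I0, and the sink is fixed at pressure zero. All function names, the dictionary-based maze representation and the example values are hypothetical.

```python
def conjugate_gradient(a, b, tol=1e-12, max_iter=1000):
    """Solve a x = b for a symmetric positive definite matrix (list of lists)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)  # residual b - a x for the start vector x = 0
    p = list(r)  # first search direction
    rs_old = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x


def solve_pressures(num_nodes, conductance, source, sink, i0=1.0):
    """Node pressures of a connected maze with the sink grounded at 0.

    `conductance` maps an undirected tube (u, v) to its conductance
    (assumed here to be D_uv / L_uv).
    """
    keep = [k for k in range(num_nodes) if k != sink]
    index = {node: row for row, node in enumerate(keep)}
    a = [[0.0] * len(keep) for _ in keep]
    for (u, v), g in conductance.items():
        for s, t in ((u, v), (v, u)):
            if s == sink:
                continue  # the sink row is removed from the system
            a[index[s]][index[s]] += g
            if t != sink:
                a[index[s]][index[t]] -= g
    b = [0.0] * len(keep)
    b[index[source]] = i0
    reduced = conjugate_gradient(a, b)
    pressures = [0.0] * num_nodes
    for node, row in index.items():
        pressures[node] = reduced[row]
    return pressures


# Toy maze: a direct tube 0-2 and a detour 0-1-2, all with conductance 1.
maze = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
p = solve_pressures(3, maze, source=0, sink=2)
flux = {(u, v): g * (p[u] - p[v]) for (u, v), g in maze.items()}
```

In this toy maze the direct tube carries twice the flux of the detour. A positive entry in `flux` means that the sol flows from the first to the second node of the tube.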

Another benefit of C-PhyL is that the algorithm does not require a maximum in-degree. Score-based structure learning methods are restricted to a maximum number of parents per node, as score calculation becomes infeasible with a growing set of parents.
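One reason for this restriction is that, for discrete variables, the conditional probability table of a node grows exponentially with the number of its parents. The short Python sketch below counts the free parameters of such a table; the function name and the example cardinalities are illustrative assumptions.

```python
from math import prod


def free_parameters(child_states, parent_states):
    """Free parameters of a discrete conditional probability table.

    One distribution over the child (child_states - 1 free values)
    is needed for every joint configuration of the parents.
    """
    return (child_states - 1) * prod(parent_states)


# A binary child with 0, 5, 10 and 20 binary parents.
sizes = [free_parameters(2, [2] * k) for k in (0, 5, 10, 20)]
```

Each additional binary parent doubles the number of parameters that a score has to estimate from the same amount of data.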

Further, C-PhyL can easily be adapted for use with continuous variables, as Cramér's V correlation coefficient can be replaced, for example, by the common Pearson correlation coefficient. Of course, the representation of conditional probability distributions has to be updated to work for continuous data, too.
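The two coefficients can be sketched side by side in Python as follows. The function names are hypothetical and the code uses the textbook definitions without bias correction; it is not taken from the C-PhyL implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V for two categorical samples of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined for a constant variable; 0 by convention here
    chi2 = 0.0
    for x, row_count in row.items():
        for y, col_count in col.items():
            expected = row_count * col_count / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two numeric samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for a constant variable; 0 by convention here
    return sxy / sqrt(sxx * syy)


v_dependent = cramers_v(list("aabb"), list("uuvv"))
v_independent = cramers_v(list("aabb"), list("uvuv"))
r_negative = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])
```

Note that Cramér's V lies in [0, 1] while the Pearson coefficient is signed and lies in [-1, 1], so a drop-in replacement would presumably use its absolute value; this is an assumption and not stated in the text.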

CHAPTER 4