Performance Statistics - Experiments with GC-ATF

8.4.3 Summary

As we discussed in §8.2.6, GC-ATF creates the same information as HR, which means GC-ATF is certainly useful in performing speculative concept and conjecture formation. In all, after taking into account the lack of reviewing in GC-ATF, the two systems' concept and conjecture creation is very similar. In particular, GC-ATF's weaknesses lie in reporting too much uninteresting information rather than in failing to report information. HR's use of objects of interest and sub-objects allows it to reject some concept creation paths. In addition, its conjecture review process removes conjectures that can be proven from existing equivalences.

Note that we did not use the Size production rule in either system. When we used it in HR, it generated 6 additional concepts, all regarding the number of habitats animals live in. By contrast, when we introduced our Size method, it generated substantially more concepts, concerning other background concepts. Consequently, we did not include Size in this experiment, as it would have made a comparison between the systems quite difficult. We saw that Size can be used effectively in our configuration when we investigated number theory in §8.3. We believe the underlying reasons for the difference between HR and GC-ATF will be similar to those we have already seen above.

8.5 Performance Statistics

We now describe what happens during a GC-ATF run from a processing viewpoint by presenting a number of runtime scenarios. We describe the numbers and types of processes that are attached to the workspace and comment on how the profile of broadcasts and attached processes changes with time. In these scenarios, we consider running an investigation into QG3 quasigroups to various complexity limits. The configuration of GC-ATF we used is the same as described in the worked example in §7.7; however, we included an EquivalenceSplitter (§7.3.10) process to improve proving speed.

In figure 8.2, we show a processing profile for investigating QG3 quasigroups to a complexity limit of 4. This process generated 5,363 broadcasts and lasted 1m 15s on a 3.0GHz Intel Pentium IV with 1GB of RAM, creating 58 NewConcept definitions and 507 proved conjectures. Along the bottom of the graph are the round numbers. The top line in the figure portrays the number of processes attached to the workspace, which peaks at 1,843 processes in round 1,715.

The bottom line portrays the number of broadcast proposals in each round. This peaks at 573 proposals in round 956. Between these lines we have indicated, with crosses, where NewConcept artefacts were broadcast. As the diagram indicates, there is a large increase in process numbers over the early rounds. The number of processes reaches a peak and then tails off, with processing ending when the number of proposals reaches zero.

Figure 8.2: Counts in ATF Example (complexity 4).

This profile is the same in all GC-ATF runs, and we have included, as figure 8.3, a similar chart for running to complexity 5 to illustrate this. The number of processes increases whenever there is a NewConcept broadcast. Firstly, new DefinitionCreator processes are attached by the existing binary DefinitionCreator processes. These record the broadcast definition to use in combination with future NewConcept broadcasts. Secondly, all the unary DefinitionCreator processes that can use the broadcast to make a new Definition do so, and they attach RepeatProposer processes to ensure their creation is not forgotten. Thirdly, all previously spawned ImplicationMaker processes that identify a conjecture propose it and attach a RepeatProposer.

Figure 8.3: Counts in ATF Example (complexity 5).

As we come to round 1,000, the complexity of the NewConcept definitions is approaching the limit of 4, and so there are fewer new concept proposals and the broadcast of a NewConcept artefact triggers fewer attachments of DefinitionCreator and RepeatProposer processes. Contrast the broadcast of a NewConcept in round 951, which resulted in 102 new processes being attached, with that in round 2,463, which had no impact upon the number of attached processes. The first was complexity 3 whereas the second was complexity 4.

Once we have reached the complexity limit, concept formation tails off and there is just a backlog of proposals waiting to be broadcast. The Definitions will be broadcast in descending order of the importance that they were ascribed when created. As these are broadcast, there may be a number of conjecture proposals. However, the amount of spawning is not as significant as when large-scale concept formation was still taking place. When all these proposals have been broadcast, processing ends.
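The broadcast-and-drain behaviour described above can be illustrated with a minimal sketch. It is not the GC-ATF implementation: the class names `Workspace`, `Proposal` and the simplified `RepeatProposer`, along with the `react` method, are hypothetical stand-ins, and the sketch assumes one broadcast per round with the highest-importance proposal winning. It shows only how a backlog of proposals is broadcast in descending order of importance and how processing ends once no proposals remain.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Proposal:
    importance: float  # importance ascribed when the artefact was created
    artefact: str


class RepeatProposer:
    """Hypothetical stand-in: re-proposes one artefact until it is broadcast."""

    def __init__(self, artefact: str, importance: float) -> None:
        self.artefact = artefact
        self.importance = importance

    def react(self, broadcast: str, workspace: "Workspace") -> Optional[Proposal]:
        if broadcast == self.artefact:
            # Our artefact has been broadcast, so there is nothing left to remember.
            workspace.detach(self)
            return None
        return Proposal(self.importance, self.artefact)


class Workspace:
    """Assumed simplification: one broadcast per round, highest importance wins."""

    def __init__(self) -> None:
        self.processes: list = []
        # One (round, attached processes, proposals) tuple per round.
        self.history: List[Tuple[int, int, int]] = []

    def attach(self, process) -> None:
        self.processes.append(process)

    def detach(self, process) -> None:
        self.processes.remove(process)

    def run(self, first_broadcast: str) -> List[str]:
        broadcasts: List[str] = []
        current: Optional[str] = first_broadcast
        round_no = 0
        while current is not None:
            round_no += 1
            broadcasts.append(current)
            proposals = []
            # Iterate over a copy, because processes may detach while reacting.
            for process in list(self.processes):
                proposal = process.react(current, self)
                if proposal is not None:
                    proposals.append(proposal)
            self.history.append((round_no, len(self.processes), len(proposals)))
            best = max(proposals, key=lambda p: p.importance, default=None)
            current = best.artefact if best is not None else None
        return broadcasts


workspace = Workspace()
for name, importance in [("low", 0.2), ("high", 0.9), ("mid", 0.5)]:
    workspace.attach(RepeatProposer(name, importance))
order = workspace.run("start")
```

In this toy run the backlog drains in the order `high`, `mid`, `low`, and the final history entry records zero attached processes and zero proposals, which is the termination condition described in the text.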

Figure 8.4 shows a breakdown of the types of each of the 5,362 broadcasts during processing. There were 536 Definition broadcasts arising from DefinitionCreator processes. These were reviewed by DefinitionCreator and 507 were found to be unique and broadcast as Concept broadcasts. Of these, 56 were found to have no example sets, and a level 0 non-existence Conjecture was raised for them. The remaining 451 Concept broadcasts were allocated into 58 equivalence classes, each represented by a NewConcept broadcast. The remaining 393 were identified as equivalent to earlier NewConcept broadcasts and an equivalence conjecture was raised instead. In addition to these 393, a further 6 equivalence Conjecture broadcasts were generated by the ConjectureReducer process, giving a total of 399 equivalence conjectures. The ImplicationMaker process was responsible for identifying 379 conjectures where NewConcept example sets were subsets of each other. All these were broadcast as level 0.
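The counts in this breakdown can be cross-checked with a few lines of arithmetic. The variable names below are ours, and the two derived totals (non-unique definitions and total level 0 conjectures) are not stated in the text; they simply follow from the reported figures.

```python
# Figures reported in the text.
definitions = 536            # Definition broadcasts
concepts = 507               # unique definitions, broadcast as Concepts
empty_concepts = 56          # Concepts with no example sets
new_concepts = 58            # equivalence classes (NewConcept broadcasts)
reducer_equivalences = 6     # equivalences from ConjectureReducer
implications = 379           # conjectures from ImplicationMaker

# Quantities the text derives from them.
non_empty_concepts = concepts - empty_concepts                # 451
equivalent_to_earlier = non_empty_concepts - new_concepts     # 393
equivalences = equivalent_to_earlier + reducer_equivalences   # 399

# Derived here, not stated in the text.
non_unique_definitions = definitions - concepts               # 29
level0_conjectures = empty_concepts + equivalences + implications  # 834
```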

Figure 8.4: Breakdown of ATF example broadcasts (complexity 4).

All the level 0 Conjectures triggered the EquivalenceSplitter process. It converted all the equivalences into two level 1 implication Conjecture broadcasts and elevated all other level 0 Conjecture broadcasts to level 1 without alteration, as they require no splitting. Consequently, we saw 1,177 (2 × 399 + 379) level 1 implication conjectures and 56 level 1 non-existence conjectures. We configured the TranslateOtter process to react to level 1 Conjecture artefacts and each of these was, subsequently, broadcast as a Conjecture including an Otter translation.
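The splitting step can be sketched as follows. This is an illustration only: the `Conjecture` record, its field names and the `split` function are hypothetical and do not reflect the actual EquivalenceSplitter code. The sketch reproduces the level 1 counts from the level 0 counts given in the text.

```python
from collections import Counter
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Conjecture:
    """Hypothetical record; field names are assumptions for illustration."""
    kind: str        # "equivalence", "implication" or "nonexistence"
    lhs: str
    rhs: str = ""
    level: int = 0


def split(conjecture: Conjecture) -> List[Conjecture]:
    """Turn one level 0 conjecture into its level 1 counterpart(s)."""
    if conjecture.level != 0:
        return []  # only level 0 conjectures trigger the splitter
    if conjecture.kind == "equivalence":
        # A <-> B becomes the two implications A -> B and B -> A.
        return [
            Conjecture("implication", conjecture.lhs, conjecture.rhs, level=1),
            Conjecture("implication", conjecture.rhs, conjecture.lhs, level=1),
        ]
    # Everything else needs no splitting and is elevated unchanged.
    return [replace(conjecture, level=1)]


# Placeholder conjectures in the quantities reported in the text.
level0 = (
    [Conjecture("equivalence", f"c{i}", f"d{i}") for i in range(399)]
    + [Conjecture("implication", f"p{i}", f"q{i}") for i in range(379)]
    + [Conjecture("nonexistence", f"e{i}") for i in range(56)]
)
level1 = [out for conj in level0 for out in split(conj)]
counts = Counter(conj.kind for conj in level1)
```

With these inputs, `counts` holds 1,177 implications and 56 non-existence conjectures, matching the figures above.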

Lastly, the Prover process managed to prove 507 of these translated conjectures, broadcasting them as Explanation artefacts.

During the session, a total of 5,463 processes were created. Figure 8.5 shows the breakdown into process types. By far the most common process type is the RepeatProposer, which is used to remember Definition and Conjecture broadcasts; there is approximately one for each broadcast artefact of these types. In the case of the DefinitionRepeatProposer, there are slightly more than the final number of Definition broadcasts, as these processes detach themselves if they see their definition broadcast, rather than waiting to broadcast it themselves. Note that we may be able to reduce the need for ConjectureRepeatProposer processes by a careful review of how processes ascribe importance to conjectures. However, this is complicated by equivalence splitting.

Figure 8.5: Breakdown of ATF example processes (complexity 4).

There are 34 each of the binary versions of the Negate and Compose DefinitionCreator processes, which are attached at the start. These each spawn new unary versions in response to NewConcept broadcasts. However, they only spawn new unary processes if their parameterisations are appropriate for the NewConcept definition, which is why there are far fewer than 58 × 34 spawned processes. Note also that there are fewer Negate unary processes because this method leads to higher complexity concept definitions, which is taken into consideration before spawning. One additional ImplicationMaker and one additional EquivalenceReviewer are also spawned for each NewConcept, so there are 59 of each of these in total. The other processes are such things as Prover and ExampleFinder. These other processes do not spawn a significant number of other processes.
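For scale, the bound and totals mentioned above work out as follows. The assumption that exactly one ImplicationMaker and one EquivalenceReviewer are attached at the start is ours, inferred from the stated total of 59.

```python
new_concepts = 58        # NewConcept broadcasts in the complexity 4 run
binary_per_method = 34   # binary Negate (or Compose) processes attached at the start

# Upper bound on unary processes per method if every binary process
# spawned a unary version for every NewConcept.
upper_bound_per_method = new_concepts * binary_per_method   # 1972

# Assumed: one process of each kind is attached initially, plus one per NewConcept.
initially_attached = 1
implication_makers = initially_attached + new_concepts      # 59
equivalence_reviewers = initially_attached + new_concepts   # 59
```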

Table 8.10 provides a breakdown of overall processing time. The largest single element is proving, as we discussed in §8.2.5. Housekeeping refers to the central operation of the workspace, such as attaching and detaching processes, looping through the list of attached processes, and recording and selecting proposals for broadcast.

Table 8.10: Breakdown of ATF example processing time.

In document A global workspace framework for combined reasoning (Page 189-194)