Multiple classification ripple round rules: classifications as conditions

Multiple Classification Ripple Round Rules:

Classifications as Conditions

A dissertation submitted to the Faculty of Science, Engineering and Technology, University of Tasmania in fulfilment of the requirements for the Degree of Doctor of Philosophy.

Ivan Karl Bindoff

BComp (Hons, First Class)

Statement of Originality and Access Authority

This dissertation contains no material which has been accepted for the award of any degree or diploma by the University of Tasmania or any other tertiary institution, except by way of background information and duly acknowledged in this dissertation, and to the best of the candidate’s knowledge and belief, this dissertation contains no material previously published or written by another person, except where due acknowledgement is made in the text of the dissertation.

This dissertation may be made available for loan and/or limited copying in accordance with the Copyright Act 1968.

Ivan Karl Bindoff June 2010

Statement of Ethical Conduct

The research associated with this thesis abides by the international and Australian codes on human and animal experimentation, the guidelines by the Australian Government’s Office of the Gene Technology Regulator and the rulings of the Safety, Ethics and Institutional Biosafety Committees of the University.

Abstract

The Ripple Down Rules (RDR) approach was developed by Compton and Jansen (Compton and Jansen 1989; Compton and Jansen 1992) to effectively remove the maintainability concerns of expert systems. This method was used to create an advanced expert system to assist in the performance of medication reviews. However, work in this area, although very successful, led to the realisation that the RDR method had a drawback: with this method it was no longer possible to define rules which depend on the presence or absence of a classification or classifications.

Previous attempts were made to address this, with Recursive RDR (Mulholland 1995), Nested RDR (Beydoun and Hoffmann 1997) and Repeat Inference MCRDR (Compton and Richards 1999) all deserving acknowledgement in this regard. However, each of these approaches had its own shortcomings. Recursive RDR suffered from problems with cyclic rule definitions, and was very domain specific (Mulholland 1995). Nested RDR was concerned more with the idea of intermediate classifications than with the more general problem of defining a rule based on the presence or absence of a classification or classifications (Beydoun and Hoffmann 1997; Beydoun and Hoffmann 2001). Repeat Inference MCRDR tackled the general problem, but its approach to preventing cycles, which is to disallow the retraction of assertions, fundamentally limits the scope of rules which can use classifications as conditions. In addition, there are some minor concerns about the efficiency of its inference strategy, which simply re-runs inference over the knowledge base until no further changes to the outputs are detected (Compton and Richards 1999; Finlayson 2008).

Acknowledgements

To my supervisor, Byeong Ho Kang: To you I am thankful for many wonderful things, such as guidance, direction and support. You seem to believe very strongly in my abilities, and that belief, in turn, gives me reassurance which is sometimes sorely needed. However, I am also thankful for a number of not-so-nice things. You are perhaps the best player of the role of devil’s advocate that I know, even if it drove me into fits of intense frustration at times. I am forced to acknowledge that it did very effectively arm me with the tools I need to defend my ideas – probably even against a barbarian horde, were it necessary. A warning to any future PhD candidate of Byeong’s – he will rip your ideas apart mercilessly and with an infuriating feigned ignorance, but it’s for your own good.

To my supervisor, Gregory Peterson: You took a fairly hands-off role throughout my candidature, which was probably a good thing considering how much Byeong was already nagging me, and considering how small a part medication review ended up playing in this thesis. However, when push came to shove, you really came through for me in a big way. I am very grateful for this support, and I hope that you feel you have been suitably rewarded for placing your trust in my abilities, and have no (or at least few) regrets. I assure you that I intend to capitalise on the opportunities you have laid out before me.

To my partner, Vanessa Wronski: You are an endless source of amusement and encouragement. You balance me. When completing a thesis it is normally the role of your partner to be proud of you, but I am instead proud of you. On balance you worked harder than me during these past four years, yet together we managed to stay quite within the acceptable bounds of sanity. We’ve lived together for almost 4 years now, and next week we move into our first home together. I can only hope that this adventure will be as good as our previous ones.

To my friend, Tristan Ling: I’m singling you out because you chose a similar path to me, and as such were a source of valuable discussion and debate. Talking through my ideas with you helped me flesh them out, and learn how to explain them better. I’m sorry I haven’t been able to help you more with yours yet, but trust that I will make myself available to do that when you need me to. Thank you also for proofreading this thesis; the favour will be paid back.

Figures

Figure 2-1 A simple set of rules. ... 32

Figure 2-2 A complete set of facts. ... 32

Figure 2-3 A simple fuzzy rule set. ... 33

Figure 2-4 The difference between knowledge expressed by the expert, and the knowledge as it must be represented in a standard knowledge base (Compton and Jansen 1989). ... 35

Figure 2-5 The case based reasoning cycle (Aamodt and Plaza 1994). ... 40

Figure 2-6 A simple RDR knowledge base, where arrows pointing upwards indicate the TRUE path while arrows heading downwards indicate FALSE paths. ... 42

Figure 2-7 For a case [X=5, Y=5, Z=10] the emphasised rules are those which were evaluated, while the highlighted rule is the one which ultimately fired. ... 42

Figure 2-8 An example of a compound classification in RDR. ... 47

Figure 2-9 The previous example of a compound classification RDR knowledge base converted to MCRDR. ... 48

Figure 2-10 The difference list approach (Kang 1995). ... 52

Figure 2-11 The general MCRDR knowledge acquisition process. ... 53

Figure 3-1 An ATC code, example shown being Furosemide (Frusemide in Australia). We can also determine from this code that it is a high-ceiling diuretic in the Sulfonamides group. ... 80

Figure 3-2 The growth charts of both knowledge bases. ... 86

Figure 3-3 Conditions per rule. ... 87

Figure 3-4 Accuracy of the provided classifications. ... 89

Figure 3-5 The total number of cornerstone cases found for each rule. ... 92

Figure 3-6 The number of conditions added per rule in order to eliminate all cornerstone cases. ... 93

Figure 3-7 Time per rule. ... 94

Figure 3-8 Time per case. ... 95

Figure 3-9 The deviation from the original number of classifications found by the expert and the number found by the system after training was completed. ... 98

Figure 4-1 An example of an exception which uses a classification as a condition of its rule. This rule could not be represented with the RIMCRDR knowledge representation scheme. ... 112

Figure 4-2 An example representation of a simple MCRRR knowledge base. ... 115

Figure 4-3 The MCRDR inference algorithm (Kang 1995). ... 116

Figure 4-4 The MCRRR inference algorithm. ... 117

Figure 4-5 A simple example of a cyclic knowledge base. ... 121

Figure 4-6 Pseudo-code for a topological sort of a directed acyclic graph (Kahn 1962). ... 122

Figure 4-7 The cycle detection algorithm used in this study. ... 123

Figure 4-8 The simplest example of a cycle. ... 124

Figure 4-9 A third example of a cycle. ... 125

Figure 4-10 A fourth example of a cycle. Inclusive of class not present conditions. ... 125

Figure 4-11 The growth of the pizza suggestions knowledge base. ... 135

Figure 4-12 The number of conditions per rule for the pizza suggestions knowledge base. ... 136

Figure 4-13 The percentage of correct classifications provided by the system for each case. ... 137

Figure 4-14 How many times each classification was used. ... 139

Figure 4-15 Time taken to create each rule. ... 140

Figure 4-16 The blocks which must be placed in each grid (case). Each block has an identification number 1-8 from left to right. ... 144

Figure 4-17 A fully loaded grid with 4 unavailable cells. One block remains correctly unplaced. ... 145

Figure 4-18 An example of a solution suggested by the system which has shown an overlap. ... 147

Figure 4-19 The growth rate of the blocks knowledge base. ... 150

Figure 4-20 The number of conditions per rule in the blocks experiment. ... 150

Figure 4-21 Number of classifications used as conditions per rule. ... 152

Figure 4-22 Correct classifications provided by system. Shown with a moving average with a period of 100. ... 153

Figure 4-23 The number of uses of each classification. ... 154

Figure 4-25 The number of cycles detected per rule. ... 157

Figure 4-26 The total number of alternate solutions suggested by the MCRRR method for 2000 cases after varying amounts of training. ... 159

Figure 4-27 Instances where forced additions resulted in the system finding alternate solutions. ... 160

Figure 4-28 Instances where forced removals resulted in the system finding an alternate solution. ... 160

Figure 5-1 C4.5 Algorithm (Kotsiantis 2007) ... 167

Figure 5-2 A simple example of a (bad) grouping rule. ... 179

Figure 5-3 The hindsight algorithm to “convert” MCRDR knowledge bases to MCRRR knowledge bases. ... 181

Figure 5-4 Growth of the knowledge base for the bibtex dataset. ... 186

Figure 5-5 Growth of the knowledge bases for the emotions dataset. ... 187

Figure 5-6 Growth of the knowledge base for the enron dataset. ... 188

Figure 5-7 Growth of the knowledge base for the genbase dataset. ... 189

Figure 5-8 Growth of the knowledge base for the medical dataset. ... 190

Figure 5-9 Growth of the knowledge base for the scene dataset. ... 191

Figure 5-10 The growth of the knowledge base for the yeast dataset. ... 192

Figure 5-11 The accuracy of the system relative to the simulated experts with the bibtex dataset. ... 193

Figure 5-12 The accuracy of the system relative to the simulated experts with the emotions dataset. ... 194

Figure 5-13 The accuracy of the system relative to the simulated experts with the enron dataset. ... 195

Figure 5-14 The accuracy of the system relative to the simulated experts with the genbase dataset. ... 196

Figure 5-15 The accuracy of the system relative to the simulated experts with the medical dataset. ... 197

Figure 5-16 The accuracy of the system relative to the simulated experts with the scene dataset. ... 198

Figure 5-17 The accuracy of the system relative to the simulated experts with the yeast dataset. ... 199

Figure 5-19 The average number of conditions for every 10 rule cluster in the emotions dataset. ... 201

Figure 5-20 The average number of conditions for every 10 rule cluster in the enron dataset. ... 202

Figure 5-21 The average number of conditions for every 10 rule cluster in the genbase dataset. ... 203

Figure 5-22 The average number of conditions for every 10 rule cluster in the medical dataset. ... 204

Figure 5-23 The average number of conditions for every 10 rule cluster in the scene dataset. ... 205

Figure 5-24 The average number of conditions for every 10 rule cluster in the yeast dataset. ... 205

Figure 5-25 The average depth for every cluster of 10 rules for the bibtex dataset. ... 206

Figure 5-26 The average depth for every cluster of 10 rules for the emotions dataset. ... 207

Figure 5-27 The average depth for every cluster of 10 rules for the enron dataset. ... 208

Figure 5-28 The average depth for every cluster of 10 cases for the genbase dataset. ... 208

Figure 5-29 The average depth for every cluster of 10 cases for the medical dataset. ... 209

Figure 5-30 The average depth for every cluster of 10 cases for the scene dataset. ... 210

Figure 5-31 The average depth for every cluster of 10 cases for the yeast dataset. ... 210

Figure 5-32 The total number of cornerstone cases found for each rule in the bibtex dataset. ... 212

Figure 5-33 The number of conditions added to remove each cornerstone case for the bibtex dataset. ... 212

Figure 5-34 Total cornerstone cases found for each rule in the emotions dataset. ... 213

Figure 5-36 The total number of cornerstone cases found for each rule in the enron dataset. ... 215

Figure 5-37 The number of conditions added to eliminate all cornerstone cases for each rule in the enron dataset. ... 215

Figure 5-38 The total number of cornerstone cases found for each rule in the genbase dataset. ... 216

Figure 5-39 The number of conditions added to eliminate all cornerstone cases for each rule in the genbase dataset. ... 217

Figure 5-40 The total number of cornerstone cases found for each rule in the medical dataset. ... 218

Figure 5-41 The number of conditions added to eliminate all cornerstone cases for each rule in the medical dataset. ... 218

Figure 5-42 The total number of cornerstone cases found for each rule in the scene dataset. ... 219

Figure 5-43 The number of conditions added to eliminate all cornerstone cases for each rule in the scene dataset. ... 220

Figure 5-44 The total number of cornerstone cases found for each rule in the yeast dataset. ... 220

Figure 5-45 The number of conditions added to eliminate all cornerstone cases for each rule in the yeast dataset. ... 221

Figure 5-46 The number of grouping rules and the reduction of conditions for the bibtex dataset. ... 222

Figure 5-47 The number of grouping rules and the reduction of conditions for the emotions dataset. ... 223

Figure 5-48 The number of grouping rules and the reduction of conditions for the enron dataset. ... 224

Figure 5-49 The number of grouping rules and reduction of conditions for the medical dataset. ... 225

Figure 5-50 The number of grouping rules and the reduction of conditions for the scene dataset. ... 226

Figure 5-52 A benchmark simulation, with 20% chance of exceptions and no rules based on classifications. A pure MCRDR simulation of this type would be expected to show a linear growth. ... 235

Figure 5-53 The same benchmark simulation, allowing Excel to use a higher order polynomial than necessary. ... 235

Figure 5-54 A simulation with a 10% chance of exceptions and a 10% chance of rules based on classifications. ... 237

Figure 5-57 A simulation with a 10% chance of exceptions and an 80% chance of rules based on classifications. ... 238

Figure 5-60 A benchmark simulation with a 20% chance of exceptions and a 50% chance of rules based on classifications. ... 240

Figure 5-61 The time taken to perform 100 inferences at various evenly distributed points through the 10% exception and 10% rules based on classification experiment with the scene dataset. ... 241

Figure 6-4 A simulation with a 20% chance of exceptions and an 80% chance of rules based on classifications. ... 259

Figure 6-7 A benchmark simulation, with 20% chance of exceptions and no rules based on classifications. ... 261

Figure 6-20 A benchmark simulation with a 20% chance of exceptions and a 50% chance of rules based on classifications. ... 267

Figure 6-21 The number of rules in the system and the time taken to perform 100 inferences at 9 separate points during the simulated stress test runs for 10% exceptions and 20% classifications. ... 268

Tables

Table 1 ICPC-2 PLUS terms for keyword 'vascular'. ... 79

Table 2 The number of rules which had 0-4 classifications as conditions. ... 151

Table 3 The number of rules at each depth. ... 156

Table 4 The multi class datasets used, and their relevant statistics. ... 183
