

In FreeMind, the terms ‘node’ and ‘concept’ refer to the text in any branch, and will be represented thus: [a mind map node]. The BOI at the centre of any mind map is the root node, or simply the root. Various main risk categories radiate from the root in Figure 1.4; one such concept expresses Risk To Others, [RTO]. That so-called parent node branches into various children, such as [state of mind], which in turn is the parent of [hopeless]. In that respect, a path comprises the nodes that must be traversed from the root to reach any specific node. Along with [hopeless], two further shaded nodes, [substance misuse] and [impulsive], are called ‘leaf nodes’ due to having no branches. The GRiST mind map in Figure 1.4, then, appears to conform to the rules of mind mapping, being a hierarchy of nodes radiating from a central root. Closer inspection, though, reveals differences of varying importance, as shown next.
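The terms root, parent, child, path and leaf can be illustrated with a minimal Python sketch. It relies on FreeMind storing a mind map as XML, with nested node elements that carry their text in a TEXT attribute. The fragment below is hypothetical: the node names come from the text, but placing [substance misuse] and [impulsive] directly under [RTO] is an assumption made only for illustration, as is the function name leaf_paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in FreeMind's .mm layout. The positions of
# "substance misuse" and "impulsive" are assumed for illustration only.
MM_SOURCE = """
<map version="0.9.0">
  <node TEXT="condensed 16">
    <node TEXT="RTO">
      <node TEXT="state of mind">
        <node TEXT="hopeless"/>
      </node>
      <node TEXT="substance misuse"/>
      <node TEXT="impulsive"/>
    </node>
  </node>
</map>
"""

def leaf_paths(element, prefix=()):
    """Yield the root-to-leaf path (a tuple of node texts) for every leaf node."""
    path = prefix + (element.get("TEXT", ""),)
    children = element.findall("node")
    if not children:
        # A node without branches is a leaf; its path is complete.
        yield path
    for child in children:
        yield from leaf_paths(child, path)

root = ET.fromstring(MM_SOURCE).find("node")   # the root node of the mind map
paths = list(leaf_paths(root))
for path in paths:
    print(" > ".join(path))
```

Each printed line lists the nodes that must be traversed from the root to reach one leaf, which is the sense of ‘path’ used above.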

1.5 Challenges Posed by GRiST Mind Maps

An important difference between the GRiST mind map from Figure 1.4 and the ideal from Figure 1.3 is a lack of images and colours. Such lack of adherence to the first rule of mind mapping, though, is irrelevant; rather than acting as an aid to memory, or as a means of making presentations, GRiST mind maps serve as a precursor to a database of mental health knowledge. The emphasis here, then, is on allowing machines, rather than humans, to retrieve and process knowledge from mind maps. In that respect, only concepts expressed as words are important; omitting the images required by rule one is of no consequence. Neither is the lack of adherence to rule two, which demands a single BOI; the root node [condensed 16] is not really a BOI, but a label in the emerging database. Rather, it is the single-word concepts demanded by rule three that are of crucial importance; concepts in GRiST mind maps often contain several words, contravening that requirement. In illustration, long nodes from the sub-hierarchy under [RTO] that were hidden earlier in Figure 1.4 are revealed overleaf in Figure 1.5:


Figure 1.5: Detail from a mind map from GRiST


The mind map from Figure 1.5 shows that some ‘concepts’ in GRiST mind maps were actually quite complex ideas. While such richness provided GRiST researchers with a wealth of information about mental health risks, it hinders any fully automated approach to interpreting mind maps. Indeed, the two nodes highlighted in Figure 1.5 contained so many words that they are shown truncated. Even in shorter nodes, several concepts might contribute to an overall idea. This thesis does not aim to reconfigure such nodes, as the manual refinement reported by Buckingham and Adams (2006) had already rendered them into an acceptable form. Accordingly, the root of the mind map from Figure 1.4 contained the word ‘condensed’.

Rewriting phrases as single words would, in fact, reduce mind maps to a Bag-Of-Words (BOW) that fails to represent semantic relationships (Bekkerman & Allan, 2005). In that light, insisting on single-word nodes risks losing information. Although key concepts in GRiST mind maps must be identified, the aim here is to map them, rather than to reconfigure individual nodes. Indeed, the single-word rule is counter-intuitive, as a detail of the ‘perfect’ mind map from Figure 1.3 shows; that detail concerns the idea of news being ‘hot off [the] press’, as Figure 1.6 demonstrates:

Figure 1.6: Detail from Figure 1.3 of the ‘perfect’ mind map devised by Buzan (2008).

The separate nodes [hot], [off] and [press] from Figure 1.6, though, do not capture the intended meaning. Indeed, insisting on single-word nodes would require humans and machines alike to reconstruct any original meaning, during which nodes, and accordingly ideas, might be recombined incorrectly. Permitting more expressive nodes avoids that problem of interpretation for humans, and faithfully represents intended ideas. Nodes such as [hot off the press] should be permitted, as those words collectively represent a specific idea; the same argument applies to nodes in GRiST mind maps.
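The loss of information under a bag-of-words view can be shown with a short Python sketch. The helper bag_of_words and the scrambled phrase are illustrative assumptions, not part of GRiST or FreeMind.

```python
from collections import Counter

def bag_of_words(text):
    """Reduce a phrase to unordered word counts, discarding word order."""
    return Counter(text.lower().split())

phrase = "hot off the press"
scrambled = "press the hot off"      # same words, but the idiom is destroyed

# The two bags are identical, so a bag-of-words cannot tell the phrases apart.
same_bag = bag_of_words(phrase) == bag_of_words(scrambled)

# Kept whole, as a multi-word node would be, the phrases remain distinct.
same_node = phrase == scrambled

print(same_bag, same_node)
```

A multi-word node such as [hot off the press] therefore carries ordering information that separate single-word nodes would leave to be reconstructed.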


Although GRiST allowed such multi-word nodes, the transformed knowledge tree had to be pruned in order to remove redundant nodes, and to ensure that structure’s integrity; that reflected principles from designing relational databases (Buckingham, Ahmed, & Adams, 2007). That association might, in fact, be taken further; indeed, mind maps and galateas are seen here as belonging to a wider family of ‘semantic networks’ that are refined by a process of normalisation (Mylopoulos, 1998). Although GRiST researchers started that process, mind maps from that project might themselves be normalised further; that would encourage their use as a formal repository of knowledge, rather than just as precursors to the database of mental health knowledge comprised by galateas.

Richer ideas in mind map nodes, though, hinder any automated interpretation. While optimal for viewing by humans, such longer nodes hide key concepts from machines; such concepts must, then, be isolated by researching words in various ways. In fact, what might seem a trivial problem detracts from such lexical analyses. Specifically, spelling mistakes mean that concepts might be overlooked. The problem deepens on examining responses from a freely available spelling checker: inappropriate corrections would be misleading if allowed to pass unchallenged.
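One way of challenging a suggested correction is to accept it only when it lies close to a word already known in the domain. The Python sketch below is a minimal illustration using the standard difflib module; the vocabulary, the cutoff of 0.8 and the function name vet_correction are all assumptions, and the sketch does not reproduce the spelling checker examined in this thesis.

```python
import difflib

# Hypothetical domain vocabulary; a real one would be drawn from the mind maps.
VOCABULARY = ["impulsive", "impulse", "hopeless", "substance", "misuse"]

def vet_correction(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest vocabulary word, or None when nothing is similar enough."""
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

suggestion = vet_correction("impulsve")   # a plausible typing error
unresolved = vet_correction("xyzzy")      # nothing close: leave for manual review

print(suggestion, unresolved)
```

Returning None rather than the nearest available word keeps doubtful cases visible, so that an inappropriate correction is not applied silently.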

A further obstacle to analysing mind maps automatically concerns the various forms that words take; nodes might contain related words that are hidden from machines by textual variations. The term ‘stem’ will be used from this point on to denote the short form shared by such related words. The process of stemming, then, involves mapping words to some base form (Brants, 2003). Determining the longest sub-string that identifies related words, while excluding others, will lead to groups of words that express any particular underlying concept. Stems derived on linguistic principles, though, are restricted to specific languages; the desire here is for a more flexible approach based on textual similarities alone. Such an ability to quantify differences between text strings will further help machines to select appropriate spelling corrections.
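A purely textual form of stemming can be sketched in Python as follows. The sketch simplifies the idea of a longest identifying sub-string to a longest common prefix, and the minimum stem length of four characters is an arbitrary assumption; the function name group_by_stem is likewise hypothetical. It is not the stemming procedure developed in this thesis.

```python
from os.path import commonprefix

def group_by_stem(words, min_len=4):
    """Greedily group sorted words sharing a prefix of at least min_len characters."""
    groups = {}                        # stem -> list of words sharing that stem
    stem, members = None, []
    for word in sorted(set(words)):
        shared = commonprefix([stem, word]) if stem is not None else ""
        if stem is not None and len(shared) >= min_len:
            # The word is textually related: shorten the stem to the shared prefix.
            stem = shared
            members.append(word)
        else:
            # Too little in common: close the current group and start a new one.
            if stem is not None:
                groups[stem] = members
            stem, members = word, [word]
    if stem is not None:
        groups[stem] = members
    return groups

words = ["hopeless", "hopelessness", "hoping",
         "impulse", "impulsive", "impulsivity"]
stems = group_by_stem(words)
for stem, members in stems.items():
    print(stem, members)
```

In this example [hoping] is not grouped with [hopeless], because the two share only three leading characters; the threshold thus trades the inclusion of related words against the exclusion of unrelated ones.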

In addition to stemming related word forms, the actual meanings of words affect the treatment of concepts from GRiST mind map nodes. Distinctions between, say, nouns and verbs indicate whether concepts represent things or actions, respectively. Unfortunately, the tool chosen for analysing words in this thesis cannot perform that task reliably; ambiguous words, in particular, impede attempts to determine exact concepts. To overcome that problem, a novel approach is applied to deciding appropriate word-usage automatically. That approach analyses words that appear immediately before and after certain prepositions; subsequently, patterns in the distribution of known actions and things around those prepositions will help to resolve ambiguous cases. However, the term ‘heuristic’ best describes applying reliable knowledge in that way: the search is for guidelines, rather than for absolute rules.
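The general idea of using words around prepositions as evidence can be sketched in Python. Everything in the sketch is an illustrative assumption: the preposition list, the seed lexicon of known ‘things’ and ‘actions’, the training phrases and the function names. It shows only the principle of counting known usages beside prepositions and letting those counts vote on an ambiguous word; it is not the procedure developed in this thesis.

```python
from collections import Counter, defaultdict

# Illustrative assumptions: a tiny preposition list and a seed lexicon of known usages.
PREPOSITIONS = {"of", "to"}
KNOWN = {"risk": "thing", "harm": "thing", "others": "thing", "threatens": "action"}

def context_counts(phrases):
    """Count known things/actions immediately before and after each preposition."""
    counts = defaultdict(Counter)      # (preposition, side) -> usage counts
    for phrase in phrases:
        tokens = phrase.lower().split()
        for i, token in enumerate(tokens):
            if token not in PREPOSITIONS:
                continue
            if i > 0 and tokens[i - 1] in KNOWN:
                counts[(token, "before")][KNOWN[tokens[i - 1]]] += 1
            if i + 1 < len(tokens) and tokens[i + 1] in KNOWN:
                counts[(token, "after")][KNOWN[tokens[i + 1]]] += 1
    return counts

def guess_usage(word, phrase, counts):
    """Guess 'thing' or 'action' for a word from its position beside a preposition."""
    tokens = phrase.lower().split()
    votes = Counter()
    for i, token in enumerate(tokens):
        if token != word:
            continue
        if i + 1 < len(tokens) and tokens[i + 1] in PREPOSITIONS:
            votes.update(counts.get((tokens[i + 1], "before"), {}))
        if i > 0 and tokens[i - 1] in PREPOSITIONS:
            votes.update(counts.get((tokens[i - 1], "after"), {}))
    # No adjacent preposition means no evidence, so no guess is offered.
    return votes.most_common(1)[0][0] if votes else None

training = ["risk to others", "risk of harm", "threatens to harm", "harm to others"]
counts = context_counts(training)
usage = guess_usage("abuse", "abuse of alcohol", counts)
print(usage)
```

Because the result depends on majority counts, such a procedure offers guidance rather than certainty, in keeping with its description as a heuristic.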

A further problem concerns structural variations between individual GRiST mind maps. In fact, the template used to create mind maps evolved over time, and took several iterations to stabilise. Ultimately, agreement between three independent researchers exceeded 90% on the template categories (Buckingham & Adams, 2006). While that evolutionary process helps to explain differences in hierarchical structure evident in GRiST mind maps, it remains a challenge to treating those collected mind maps as a formal semantic network. Identical nodes are repeated at various levels in those mind maps; in addition, nodes that express similar concepts often do so in slightly different wording. Processing mind maps automatically, then, demands that machines reconcile such variations.

In order to overcome that problem, the lexical tool already mentioned will be combined with a multivariate statistical analysis; the clusters of related nodes emerging from that analysis will identify nodes bearing words of similar forms, or of related meanings. Those clusters indicate node hierarchies for automatically generated, idealised mind maps. The result will constitute a refined information base, as well as being a more formal representation for human researchers. Rather than producing a sole, static combined mind map, though, the proposed approach generates such idealised node hierarchies on demand.
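How related nodes might fall into clusters can be sketched in Python. The multivariate analysis itself is not specified at this point, so the sketch substitutes a much simpler technique: single-link grouping of nodes by the Jaccard similarity of their word sets. The example nodes, the threshold of 0.5 and the function names are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_nodes(nodes, threshold=0.5):
    """Single-link clustering: a node joins any cluster holding a similar enough member."""
    word_sets = [set(node.lower().split()) for node in nodes]
    clusters = []                      # each cluster is a sorted list of node indices
    for i, words in enumerate(word_sets):
        linked = [c for c in clusters
                  if any(jaccard(words, word_sets[j]) >= threshold for j in c)]
        merged = [i]
        for c in linked:
            # Merge every cluster that the new node links to.
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return [[nodes[i] for i in c] for c in clusters]

nodes = ["harm to others", "risk of harm to others",
         "substance misuse", "alcohol and substance misuse", "impulsive"]
clusters = cluster_nodes(nodes)
for cluster in clusters:
    print(cluster)
```

Comparing whole words in this way would miss related forms such as ‘substance’ and ‘substances’; comparing stems instead of raw words is one way in which a lexical step could feed such a grouping.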

New mind maps will be created automatically to represent the expression of particular concepts across the collection of GRiST mind maps, by means of an enhanced FreeMind interface. Having described the domain of this thesis, then, the overview that follows describes the contents of forthcoming chapters.