Attribute Instantiation IO

7.5.1 Expressing data entry concepts

During testing, users were able to use the data entry interface to record the details of the specimens they were attempting to describe, based on the specialised domain models they had detailed in the specialisation process. The ability of users to express their descriptive concepts in data entry depends on their specialisation of the domain model (as described in chapter 6), but also on the ability of the system’s domain model to capture the nuances of the data (e.g. multiple values). In the final wide user test, users generally believed they had been able to enter descriptive data that accurately described their observations of the specimens: 85% expressed positive opinions, 15% were neutral and none were negative, although 18% of the positive opinions were conditional on the availability of modifiers. Full task test observations and informal user feedback supported these findings, with users able to capture new specimen-based data from various plant groups as well as legacy data on the ‘alyxia’ group of plants [based on Middleton 2000, 2002].

7.5.1.2 Use of ontology terms & data structure restrictions

Users are constrained to using the ontology terms included in the specialised domain model for a descriptive concept (expressed as one or more specialised attributes). Users showed no difficulties with working in the context of the structure of the domain model, as it was already understood from their earlier specialisation task.

Omissions in early versions of the angiosperm ontology were discussed in chapter 6. Such omissions had a less immediate effect upon data entry, except where users used workarounds for terms missing from the ontology. For example, users applied the modifier ‘not’ to the value object ‘pubescent’, a measure of hairiness, to effectively score smooth surfaces, because the domain term with this meaning (‘glabrous’) had initially been omitted from the ontology. The use of such workarounds is a concern for data quality and reinforces the dependency of data quality on the comprehensiveness of the underlying ontology.

Some users did question during testing whether it would be possible to select from value objects that were supported in the ontology for the description object and attribute in question, but which had not been included in the specialised attribute value domain. Whilst the value object definitions are assumed to be independent, and so could in principle be added, the purpose of the specialisation process is to ensure that data entry is consistent for each specimen. This consistency could not be guaranteed if users were permitted to enlarge the value domain beyond the specialisation at will. The intentions of the specialisation user cannot be second-guessed at the data entry stage, as there may be good reasons why they have excluded some value objects permitted by the wider ontology from their more specialised project subject. If the specialisation user is the same person as the data entry user and has merely overlooked an option, they have the facility to return to the specialisation process and add that option. This behaviour was occasionally observed in the full tests, particularly while data were being entered for the first 1-2 specimens.
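The restriction described above can be sketched as a simple membership check. This is a minimal illustration only, not the system's implementation: the class `SpecialisedAttribute`, its fields `ontology_values` and `value_domain`, and the example value ‘tomentose’ are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecialisedAttribute:
    """Hypothetical model of one specialised attribute."""
    name: str
    ontology_values: frozenset  # every value object the wider ontology permits
    value_domain: frozenset     # the subset chosen during specialisation

    def __post_init__(self):
        # Specialisation may only narrow the ontology, never widen it.
        if not self.value_domain <= self.ontology_values:
            raise ValueError("value domain must be a subset of the ontology values")

    def check(self, value: str) -> str:
        """Classify a value offered at data entry."""
        if value in self.value_domain:
            return "accepted"
        if value in self.ontology_values:
            # Valid in the ontology but excluded by the specialisation:
            # the user has to return to the specialisation process to add it.
            return "return to specialisation"
        return "not in ontology"


# Illustrative data only (assumed, not taken from the angiosperm ontology).
surface = SpecialisedAttribute(
    name="surface hairiness",
    ontology_values=frozenset({"pubescent", "glabrous", "tomentose"}),
    value_domain=frozenset({"pubescent", "glabrous"}),
)
```

Under these assumptions, a value such as ‘tomentose’ is rejected at data entry even though the ontology permits it, which keeps entries consistent across specimens while leaving the route back through specialisation open.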

7.5.1.3 Use of multiple values

In order to record their descriptive concepts accurately, users need to distinguish between AND and OR multiple values. This question sets aside whether users are able to make such a distinction in their interpretation of the real-world specimen, which is a domain issue.

Users were observed to understand the distinction between AND-ing and OR-ing within the interface, including how to express those concepts. Using speak-aloud methodology, users were observed on a number of occasions expressing real-world descriptive concepts that equated to AND-ing or OR-ing. They were subsequently observed accurately expressing those concepts using the interface’s AND/OR multiple value facilities. Feedback in the attribute instantiation IO’s informational data display was observed to serve as a useful check for users to ensure they had accurately expressed their AND or OR concepts.

Some users did require an initial explanation of how to express alternate value concepts (OR-ing) in the interface. They were nevertheless able to state that this was what they wanted to do and, following a brief explanation, had no observed difficulties in subsequent use of the facilities.
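The AND/OR distinction can be illustrated with a small data structure. This is a sketch under stated assumptions rather than the system's actual representation: the names `Combination`, `MultiValue` and `display`, and the example attributes, are hypothetical, and AND is assumed here to mean that all values were observed together while OR marks alternates.

```python
from dataclasses import dataclass
from enum import Enum


class Combination(Enum):
    AND = "and"  # assumed meaning: all values observed together
    OR = "or"    # alternates: one value or another applies


@dataclass(frozen=True)
class MultiValue:
    """Hypothetical record of an attribute scored with several values."""
    attribute: str
    values: tuple
    combination: Combination

    def display(self) -> str:
        """Readable summary a user could check against what they intended."""
        joined = f" {self.combination.name} ".join(self.values)
        return f"{self.attribute}: {joined}"


both = MultiValue("petal colour", ("white", "yellow"), Combination.AND)
either = MultiValue("leaf shape", ("ovate", "elliptic"), Combination.OR)
print(both.display())    # petal colour: white AND yellow
print(either.display())  # leaf shape: ovate OR elliptic
```

Storing the combination explicitly, rather than inferring it from the values, is what allows a textual summary of this kind to act as a check on the user's intent.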

7.5.1.4 Use of concrete description object instances

Only limited testing with concrete instances was possible due to time constraints. However, a limited narrow test at the end of the 5th phase showed that experienced users were able to comprehend and utilise concrete instances without observed difficulty.

7.5.1.5 Use of measurement units

Although generally straightforward to use, there were incidents during the user tests that gave cause for concern regarding the capture of numerical data with units of measurement. Some users were observed to enter numerical data without indicating what unit of measurement they were using. This applied to 15% of users during the final wide test. These users had not entered any preferred units during the specialisation process and were simply working within what they regarded as standard practice, assuming the units would be understood although they had not indicated them.

The mapped ontology can note preferred units for attributes, and these can be overridden during specialisation or data entry as required. The angiosperm ontology does not include this information, however, so it must be entered by users during specialisation in order to be represented in the data entry interface. During user tests, more experienced users normally used the preferred unit facility, especially if they intended to enter data for a number of specimens based on the specialisation. That was not always the case, however: one user was observed losing a lot of time by repeatedly entering units of measurement at data entry, because they had rushed the specialisation process and failed to set preferred units. A default preferred unit in the ontology that could be overridden would avoid this issue, if a standard could be found.

Whilst there are occasions where no units of measurement are required, there should be some feedback to users that they should enter units for some attributes (e.g. ‘length’, ‘height’). The system would need to rely on the mapped ontology for the knowledge about whether an attribute should have units to properly implement such directed feedback.
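The override order and the directed feedback discussed in this section could be sketched as follows. This is a hedged illustration, not the system's implementation: the functions `resolve_unit` and `unit_warning` and the set `REQUIRES_UNITS` are hypothetical, standing in for knowledge that would have to come from the mapped ontology.

```python
from typing import Optional

# Hypothetical: attributes the mapped ontology would flag as needing a unit.
REQUIRES_UNITS = {"length", "height"}


def resolve_unit(entry_unit: Optional[str],
                 specialisation_unit: Optional[str],
                 ontology_unit: Optional[str]) -> Optional[str]:
    """Return the most specific unit available, or None if none was given.

    A unit given at data entry overrides the specialisation's preferred
    unit, which in turn overrides any ontology default.
    """
    for unit in (entry_unit, specialisation_unit, ontology_unit):
        if unit:
            return unit
    return None


def unit_warning(attribute: str, unit: Optional[str]) -> Optional[str]:
    """Return feedback text when a unit is expected but missing."""
    if attribute in REQUIRES_UNITS and unit is None:
        return f"'{attribute}' normally needs a unit of measurement"
    return None


# A value entered with no unit at any level triggers the feedback.
print(unit_warning("length", resolve_unit(None, None, None)))
```

With an ontology default in place, the last argument of `resolve_unit` would never be empty for such attributes, so the warning would only arise where no standard unit could be agreed.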

7.5.1.6 Use of not scored mechanism

The not scored mechanism was observed during testing to be widely used where attributes could not be recorded due to the real-world state of the specimen. A number of users (22% in the final wide test) spontaneously commented positively on the facility during speak-aloud observation, appreciating the ability to state explicitly that the attribute had been considered but could not be scored.

During a narrow user test, users attempted to capture legacy data about the ‘alyxia’ group of plants. When capturing the legacy data, the not scored mechanism was useful for cases where the legacy descriptions omitted data or could not be clearly interpreted. A similar positive reaction was achieved regarding the special presence attribute (with one experienced taxonomist, for example, commenting that it was “very useful information to have”), where users used the facilities to mark description objects that were not present on a particular description object as absent.

7.5.1.7 Use of modifiers

The ability to give extra qualitative statements about the data they recorded was seen as very important or helpful by a substantial minority of test users. Whilst modifiers would not be likely to be useful in automatic database comparisons, they were seen to be important for added clarity regarding descriptive observations. To one user they were “essential” to accurately record their concepts. Most users, however, made only occasional or rare use of the facility.

There was some concern that some modifiers, specifically the ‘locator’ type modifiers (e.g. ‘at/on base’, ‘at/on upper surface’), would be used as an alternative to specifying descriptive concepts more accurately in the specialisation process. Instead of resorting to a locator modifier, users could specify the attribute for all the possible locations it could be observed in (or they could use a relative spatial modifier to relate the attribute to the other description object). Locator modifiers all referred to a universally applicable description object, such as ‘base’ or ‘upper surface’, that could exist as the child of any other description object. Data recorded with a locator modifier might not be comparable with data that used the description object hierarchy. Attribute data for one specimen using locator modifiers might also not be comparable with data for another specimen if they had different locators. This would not necessarily be obvious if later comparisons were made with the data, as the comparison might ignore modifiers on the basis that they are primarily useful only for extra clarity of human interpretation.

There are however possible solutions to these issues. The locator modified data could be converted into more rigorous description object hierarchy based data, either within the domain model or when the data was mapped back to the database. To do so within the domain model would require that the transformation be recorded for each modifier in the mapped ontology.
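One way to picture such a conversion is a lookup from each locator modifier to the child description object it implies. The sketch below is illustrative only: the table `LOCATOR_TO_CHILD`, the function `to_hierarchy`, the example modifier ‘sparsely’ and the assumption of at most one locator per entry are all hypothetical, and the real transformation would need to be recorded in the mapped ontology.

```python
# Hypothetical mapping that the mapped ontology would have to record:
# each locator modifier names the universal child description object it implies.
LOCATOR_TO_CHILD = {
    "at/on base": "base",
    "at/on upper surface": "upper surface",
}


def to_hierarchy(description_object: str, attribute: str, value: str,
                 modifiers: list):
    """Rewrite locator-modified data as description object hierarchy data.

    Assumes at most one locator modifier per entry.
    Returns (object_path, attribute, value, remaining_modifiers).
    """
    path = [description_object]
    remaining = []
    for modifier in modifiers:
        child = LOCATOR_TO_CHILD.get(modifier)
        if child is not None:
            path.append(child)          # the locator becomes a child object
        else:
            remaining.append(modifier)  # other modifiers are kept unchanged
    return tuple(path), attribute, value, remaining


converted = to_hierarchy("leaf", "hairiness", "pubescent",
                         ["at/on base", "sparsely"])
print(converted)
# (('leaf', 'base'), 'hairiness', 'pubescent', ['sparsely'])
```

After conversion, an entry scored with the ‘at/on base’ locator and an entry scored directly against the child object ‘base’ share the same object path, so a comparison that ignores modifiers would no longer treat them as different.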

A small number of users (15% in the final wide test, for example) expressed an interest in being able to add their own notes to instantiated attributes. Such a facility could be incorporated easily, but there are some potential drawbacks. Whilst free text notes could add clarity to the exact meaning of a user’s instantiated data, that clarity would only benefit later human interpretation; it could not be used for any sort of automatic comparison. A free text entry facility might also encourage users to bypass the constraints on data entry designed to uphold the ontology and data quality, in order to enter whatever they wanted without constraint.

If a notes facility was incorporated, one area to look at would be the facility not only to add text notes but also to add sketch drawings. Capturing a sketch drawing could be done either by scanning and attaching a file or by using a quick sketch program (e.g. diva.sketch’s JSketch [Pederson 2006]). It was found during qualitative research and early storyboard development that such a drawing can be valuable in taxonomic description for relating the positions and attitudes of various elements of a specimen. In a general sense, such a sketch facility could be of use in many domains, particularly where the visual medium was important and it was difficult to use text descriptions to capture every nuance of a descriptive feature or of the relationship of a set of such features.

In document Interactive visualisation tools for supporting taxonomists working practice (Page 195-200)