POST-SPIRAL UPDATE - SPECIFYING COMPONENTS

CHAPTER 4. SPECIFYING COMPONENTS

4.7. POST-SPIRAL UPDATE

Figure 4.6: Spiral 1 and later Spiral updates to specification

machine learning classifier (C4.5). Through this, a flat schema (rather than hierarchical) was adopted as the default schema to simplify and scale processing of the XML files.

As a result of the review of Spiral 3, Spiral 4 attached five data types to the attributes to improve data representation. These superseded the four data types used previously. Further experimentation in Spiral 4 added ontologies and distance matrices to provide a knowledge base to support the schema.

Additional changes were made to the schema in Spiral 5 to accommodate the evaluation and ranking strategies. Specifically, the Z specification within the techDescription element was fully defined and attributes were added for the evaluation metrics. A summary of these updates follows.

Hierarchical to Flat Schema

Experience with processing data from a large repository (over 41,000 items) in Spiral 3 indicated that the flat version of the schema was more appropriate for the case studies.

The developed programs needed to parse the XML document to access its content. This can be done as a stream of information, or the entire file can be read into a Document Object Model (DOM). For large files, building a DOM can cause memory problems. Moreover, in this case the large XML file of repository metadata actually represents 41,000 smaller items (components), and it is at this level that the tools need to work. The flat structure allowed the use of a SAX parser⁷ and filters instead of loading the entire document tree as a DOM. For this to be possible, all attribute names must be unique and must not rely on position in the tree to differentiate between duplicates. This also aligns with the Dublin Core approach, which prefers flat schemas so that they can be read without detailed knowledge of the XML document structure.
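The event-driven style of processing described above can be illustrated with a minimal sketch using Python's standard xml.sax module. It is not the tooling developed in this investigation: the wrapper element name component, the sample data and the callback are assumptions made for illustration only. The point it shows is that, with unique field names, each item can be assembled from the event stream and handed on without building a tree for the whole repository file.

```python
import xml.sax


class ComponentHandler(xml.sax.ContentHandler):
    """Collects one flat record per <component> element as SAX events arrive."""

    def __init__(self, on_component):
        super().__init__()
        self.on_component = on_component  # callback invoked once per item
        self.record = None                # fields of the item being read
        self.field = None                 # name of the field being read
        self.text = []                    # character chunks for that field

    def startElement(self, name, attrs):
        if name == "component":           # hypothetical item wrapper element
            self.record = {}
        elif self.record is not None:
            self.field, self.text = name, []

    def characters(self, content):
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "component":
            self.on_component(self.record)
            self.record = None
        elif name == self.field:
            # Field names are unique in a flat schema, so the name alone is the key;
            # a list is kept because some fields may occur more than once.
            self.record.setdefault(name, []).append("".join(self.text).strip())
            self.field = None


# Invented sample data, for illustration only.
SAMPLE = b"""<repository>
  <component><title>alpha</title><version>1.0</version></component>
  <component><title>beta</title><version>2.3</version>
    <standard>XML</standard><standard>SAX</standard></component>
</repository>"""

found = []
xml.sax.parseString(SAMPLE, ComponentHandler(found.append))
for item in found:
    print(item["title"][0], item["version"][0])
```

Here the callback simply appends to a list; for a large repository it would more plausibly filter or process each record as it arrives, so that only one item is held in memory at a time.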

Even in a flat schema, the abstract data model is the same, with any changes from the original implemented through extensions and enhancements. The final version of the template is described in Section 4.7.1 of this Chapter. The changes after Spiral 1 relate to the use of ontologies instead of faceted classification and the inclusion of the Z specification. Some renaming of fields also took place. Future requirements were also considered; for example, there are attributes related to the certification and testing of software components. So, although the related data was not available for this investigation, certification and authentication are attributes in the specification.

Formal Specification

In Spiral 1 interfaces and static evaluation of components were considered sufficient for the project. As the CdCE Process developed (Spiral 2), it was decided to use dynamic evaluation (testing) and the specification was changed to include behavioural information.

This change was in the ideal specification for requirements, while interface information could still be included as before. The behavioural specification was used to manually generate tests until Spiral 5, where some specifics of the representation were clarified and a test generation tool developed. The intent had been to explore the use of the behavioural specification to provide more complex test generation (based on a formal specification) and test oracle functionality. This is regarded as future work and the test generation is currently based on interfaces and equivalence classes.
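As an illustration of test generation from equivalence classes, the following is a generic sketch and not the test generation tool developed in Spiral 5. The parameter names, class labels and representative values are all hypothetical. Each parameter of an interface is partitioned into classes, one representative value is chosen per class, and one test input is produced for every combination of classes.

```python
from itertools import product

# Hypothetical interface: parameter name -> its equivalence classes,
# each class given as (label, representative value).
partitions = {
    "quantity": [("negative", -1), ("zero", 0), ("positive", 5)],
    "currency": [("supported", "AUD"), ("unsupported", "XXX")],
}


def generate_tests(partitions):
    """Yield (class labels, input values) for every combination of classes."""
    names = list(partitions)
    for combo in product(*(partitions[n] for n in names)):
        labels = tuple(label for label, _ in combo)
        inputs = {n: value for n, (_, value) in zip(names, combo)}
        yield labels, inputs


tests = list(generate_tests(partitions))
for labels, inputs in tests:
    print(labels, inputs)
```

Taking the full product keeps the sketch short; with many parameters the number of combinations grows quickly, and a tool would normally select a subset.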

Development of the CdCE Process clarified the requirements for the technical and behavioural specification of components, resulting in the inclusion of a Z specification in the model. The techDescription field is used to store the Z specification. UML, interface and other specification information could also be held under techDescription.

7 Simple API for XML (SAX) allows the XML files to be treated as a series of events (tags/attributes), which the developer is free to interpret.

Spiral 5 moved the focus to the behavioural specification and resulted in clarification of how the Z specification was stored and processed.

In the schema template, the definitions of the contents of the techDescription have changed. While the tags are not affected by the change, the internal content is. The techDescription after Spiral 5 includes the Z specification, coded in LaTeX.

This provides interface, partition and context information for the test generation.

Supporting Ontologies and Classification Schemes

The evaluation and reflection after Spiral 3 indicated that results could be improved if the data representation were enhanced. It was clear that more could be extracted from the data if classifications or ontologies were used for the terminology included in each attribute. Numeric and date values would also benefit from more tailored treatment.

Spiral 4 introduced ontologies to the dataset along with transforming software to regulate the incoming data.

Although this had more impact on the tools developed for processing the data, it required a tightening of the definitions of the attributes and the values they could hold.

Much of the ontology is based on the freshmeat Open-Source repository and related trove categories.

Metrics

Spiral 5 focussed on the details of the evaluation. For this, nine new attributes were added to the schema for the evaluation metrics. All of them are numeric, with values ranging between 0 and 10. For each metric, the ideal specification gives the optimal value and the preferred range (e.g. optimal 10, range 7-10).

The candidates are evaluated in Steps 4 - 6, populating the metrics of interest. The application developer indicates their required values through the ideal specification and these are used to rank the candidates in Step 7 of the CdCE Process.
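To make the relationship between the ideal specification and the candidate metrics concrete, the sketch below ranks candidates against optimal values and preferred ranges. The scoring rule is an assumption chosen for illustration and is not the ranking strategy of the CdCE Process: candidates are ordered first by how many metrics fall outside the preferred range, then by total distance from the optimal values. The metric names FFIT and TRES are taken from the schema; the candidate names and values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Ideal-specification entry for one metric: optimal value and preferred range."""
    optimal: float
    low: float
    high: float


def score(candidate, ideal):
    """Return (metrics outside range, total distance from optimal); lower is better."""
    misses = 0
    distance = 0.0
    for name, target in ideal.items():
        value = candidate.get(name)
        if value is None or not target.low <= value <= target.high:
            misses += 1
        if value is None:
            distance += 10.0  # assumed penalty: the full width of the 0-10 scale
        else:
            distance += abs(value - target.optimal)
    return misses, distance


def rank(candidates, ideal):
    """Order candidate names from best to worst match against the ideal."""
    return sorted(candidates, key=lambda name: score(candidates[name], ideal))


# Invented example values on the 0-10 scale.
ideal = {"FFIT": Target(10, 7, 10), "TRES": Target(10, 7, 10)}
candidates = {
    "comp_a": {"FFIT": 9, "TRES": 8},
    "comp_b": {"FFIT": 10, "TRES": 5},
    "comp_c": {"FFIT": 8, "TRES": 7},
}
ranking = rank(candidates, ideal)
print(ranking)
```

Under this rule comp_b is placed last despite its optimal FFIT value, because its TRES value lies outside the preferred range.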

4.7.1 Updated Schema Implementation

The following discussion of the implementation refers to the final version of the specification template. Tables 4.8 to 4.10 list the fields and contents for the final swvML schema.

The tables group the attributes by type. Unlike the initial version, the groupings do not correspond to complex attributes (containing sub-attributes).

While this flat version of the schema is in keeping with the Dublin Core style of schema development, the choice of flat or hierarchical schema when developing instance documents is left to the preferences of each organisation. Software reading schema data directly will need to address the two versions of the schema, or may pre-process the data through XSLT.
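The pre-processing step mentioned above could take several forms. The text names XSLT; the sketch below instead uses Python's standard xml.etree.ElementTree, since the Python standard library has no XSLT processor. The naming rule (joining the parent and child names with an underscore) and the sample element names are assumptions for illustration, not the naming used in swvML. It shows how nested fields can be given unique names so that position in the tree is no longer needed to tell them apart.

```python
import xml.etree.ElementTree as ET


def flatten(component):
    """Return a new element whose children are all the leaf fields of `component`."""
    flat = ET.Element(component.tag)

    def walk(node, prefix):
        for child in node:
            name = f"{prefix}_{child.tag}" if prefix else child.tag
            if len(child):                # has sub-elements: descend
                walk(child, name)
            else:                         # leaf: copy under a path-derived name
                ET.SubElement(flat, name).text = (child.text or "").strip()

    walk(component, "")
    return flat


# Invented hierarchical instance; both groups contain a <level> field.
hierarchical = ET.fromstring(
    "<component><title>alpha</title>"
    "<testing><level>2</level><organisation>LabX</organisation></testing>"
    "<certification><level>1</level></certification></component>"
)
flat = flatten(hierarchical)
fields = [(e.tag, e.text) for e in flat]
print(fields)
```

After flattening, the two level fields are distinguishable by name alone, which is the property a stream-based reader relies on.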

As part of the enhancements in Spiral 4, types of attributes in the specification have been expanded, and tools were provided along with a knowledge base for ontology attributes.

| Item | Description | Type | Schemes | Min Occurs | Max Occurs |
|---|---|---|---|---|---|
| Identification | | | | | |
| title | name of the component | xs:string | DC | 1 | 1 |
| version | current version number | xs:string | | 1 | 1 |
| date | date released | xs:string | DC & ISO8601 | 1 | 1 |
| language | component language (en) | xs:string | DC & RFC1766 | 0 | 1 |
| publisher | publishers of component | address | DC | 1 | * |
| identifier | publisher identifier | xs:string | DC | 1 | 1 |
| source | URI to access component | xs:string | URI | 1 | * |
| Description | | | | | |
| description | function of the component | xs:string | DC | 1 | 1 |
| detail | more detailed description | xs:string | | 0 | 1 |
| type | | xs:string | DC | 0 | * |
| format | | xs:string | DC | 0 | * |
| subject | classification/category | xs:string | | 0 | * |
| Commercial | | | | 0 | 1 |
| price | price and currency type | xs:string | | 0 | * |
| licence | licence covered by price | xs:string | | 0 | * |
| rights | URL for copyright info | xs:string | DC;URI | 0 | 1 |
| demos | URL of demo s/w | xs:string | URI | 0 | 1 |
| supportLevel | level of support provided | xs:string | | 0 | 1 |
| documentation | URL for documentation | xs:string | URI | 0 | 1 |
| sourceCode | source code access info | xs:string | | 0 | * |

Table 4.8: Attributes and types in swvML schema (Part 1/3)

| Item | Description | Type | Schemes | Min Occurs | Max Occurs |
|---|---|---|---|---|---|
| Technical | | | | | |
| technical | technical description | xs:string | | 0 | 1 |
| devStatus | development status | enum | | 0 | 1 |
| devLanguage | development language | xs:string | | 0 | 1 |
| operatingSystem | operating system | xs:string | | 0 | 1 |
| framework | software framework | xs:string | | 0 | 1 |
| standard | standards adhered to | xs:string | | 0 | * |
| platform | h/w platform | xs:string | | 0 | * |
| processor | processor requirements | xs:string | | 0 | * |
| memory | RAM requirements | xs:string | | 0 | 1 |
| diskSpace | disk space requirements | xs:string | | 0 | 1 |
| relation | related software | xs:string | DC | 0 | * |
| rel_id | | xs:string | | 0 | * |
| FFIT | Functional fit | xs:string | | 0 | 1 |
| FEXS | Functional excess | xs:string | | 0 | 1 |
| AEFT | Adaptation effort | xs:string | | 0 | 1 |
| TFIT | Testing fit | xs:string | | 0 | 1 |
| TRES | Test result | xs:string | | 0 | 1 |
| CX_P | Performance testing | xs:string | | 0 | 1 |
| CX_R | Reliability testing | xs:string | | 0 | 1 |
| CX_S | Stress testing | xs:string | | 0 | 1 |
| CX_U | Usage testing | xs:string | | 0 | 1 |

Table 4.9: Attributes and types in updated swvML schema (Part 2/3), new/changed items in bold

| Item | Description | Type | Schemes | Min Occurs | Max Occurs |
|---|---|---|---|---|---|
| Contact | | | | | |
| Creator | developers of component | address | DC | 1 | * |
| Support | support providers | address | | 0 | * |
| Address | | | | 0 | 1 |
| Street | street address | xs:string | | 1 | 1 |
| City | city | xs:string | | 1 | 1 |
| State | state | xs:string | | 1 | 1 |
| PostCode | postcode | xs:string | | 1 | 1 |
| Email | email address | xs:string | URI | 0 | 1 |
| Fax | fax number | xs:string | | 0 | 1 |
| Phone | telephone number | xs:string | | 0 | 1 |
| URL | URL for org. web site | xs:string | URI | 0 | 1 |
| Certification | | | | 0 | 1 |
| testing | testing info | xs:string | | 0 | * |
| t_level | level attained | xs:string | | 0 | 1 |
| t_organisation | testing organisation | xs:string | | 0 | 1 |
| certification | certification information | xs:string | | 0 | * |
| c_level | level attained | xs:string | | 0 | 1 |
| c_organisation | certification organisation | xs:string | | 0 | 1 |

Table 4.10: Attributes and types in swvML schema (Part 3/3)

4.8 Summary

This Chapter has described the investigation of RE1: the development of a component specification template, along with some of the rationale for the decisions that were made in Spiral 1 of the investigation. To position the schema, use cases were developed and the stakeholders' perspective became a key influence on the type of data required. Since components are viewed as electronic resources, the specification aligns with a widely used standard, Dublin Core. The specification is implemented as an XML Schema that is predominantly flat, which adheres to Dublin Core policy and simplifies the processing of very large files.

This schema is expected to evolve, although care has been taken to include information felt to be important to all stakeholders in the component-based software development community. Extending the schema is simple through XML, and schema developers are encouraged to help identify fields that may be brought into the schema standard in the future.

The evaluation of the specification indicates that it has satisfied the stakeholder requirements. The specification is central to the selection process which is developed and explored in the following Spirals. As such, it has evolved, as described in Section 4.7. The contributions discussed in this Chapter are the development of the specification template (C1) and the inclusion of context in the specification (C3), both of which show their value in later Spirals.

The following Chapter explores RE2, the development of the process for component selection. Spiral 2 builds on the specification from Spiral 1 and creates a process to provide scope for automation and intelligence.

In document Strategies for the intelligent selection of components (Page 164-171)