Input Data Quality Report
Review of Targe Creek - Sutherland PEM IDQ Report
Submitted to the Ministry of Sustainable Resource Management By David E. Moon, CDT Core Decision Technologies Inc.
0. EXECUTIVE SUMMARY
1. INTRODUCTION
1.1. BACKGROUND
1.2. RATIONALE FOR THE REVIEW
1.3. SCOPE OF THE REVIEW
1.4. ORGANIZATION OF THE REVIEW
1.4.1. Introduction
1.4.2. The review proper
1.4.3. Conclusions
1.5. ELEMENTS OF THE REVIEW
1.5.1. Organization and Presentation
1.5.2. Field Sampling and Field Data
1.5.3. Input Data Layers
Compilation Metadata
Spatial Quality of Input Data
Thematic Data Input Quality
Thematic Compilation and Derivation
1.5.4. Knowledge Base
2. REVIEW
2.1. ORGANIZATION AND PRESENTATION
2.2. FIELD SAMPLING AND FIELD DATA
2.3. INPUT DATA LAYERS
2.3.1. Input Data Sources
Compilation Metadata for Input Maps
Spatial Quality of Input Maps
Thematic Quality of Input Maps
2.3.2. Input Data Derivation / Compilation
Quality Control / Quality Assurance
Attribute Metadata
2.3.3. Input Data Summary Report
2.4. KNOWLEDGE BASE AND ALGORITHM REQUIREMENTS
2.4.1. Entities
2.4.2. Knowledge Base Attribute Compilation
Cross Product Resolution
Sliver adjustment
2.4.3. Belief Matrices
Sensitivity Analysis
2.4.4. Knowledge Base Validation
Test Description
Test Results
0. Executive Summary
It must be noted that the Input Data Quality standards put forth in the Predictive Ecosystem Mapping Inventory Standard were, by design, non-prescriptive and flexible enough for PEM practitioners to meet the intent of the standard without being forced to follow rigid formats, formulas, or procedures. The hope was that the standard would encourage critical evaluation of input data by PEM practitioners and encourage its reporting in a clear, concise, and complete manner. It is therefore difficult to provide a pass/fail evaluation for IDQ reports.
In addition, the Targe Creek – Sutherland IDQ report was conducted as a test of the PEM Inventory Standard and so does not represent a typical PEM IDQ report. It should, however, incorporate both the letter and the intent of the PEM Inventory Standard. The report comes closest to meeting the letter of the PEM Inventory Standard for Input Data Quality Reporting but does not meet the intent.
1. Documentation of metadata for the input data layers was incomplete. While not required in the standard, the history or pedigree of Forest Cover map revisions, base map conversion, and retrofit procedures was ignored. The report gives UTM as the TRIM projection when the data is distributed in Albers. The conversion from Albers to UTM is not reported.
2. Documentation of the attribute extraction/derivation procedures was incomplete.
3. Documentation of quality control and quality assurance was inadequate. While the use of field plots established to support development of the knowledge base as a test for the quality of primary or derived input data was commendable, the lack of documentation on sampling strategy and design, on attribute derivation and extraction procedures, and on the nature of mapping entities and map entities (spatial accuracy and thematic resolution and accuracy) limited the utility of the information.
4. Despite relatively high error values in those attributes tested against field data, there was no discussion of the sensitivity of ecosystem prediction to errors in the input data. Failure to provide a simple error analysis identifying the impact of individual and cumulative attribute errors on predictive error significantly reduces the value of this data.
5. The knowledge base was not adequately documented. The attributes used in the knowledge base were not fully defined, and the interpretive logic of the attributes used in the predictive process was inadequately described.
6. The knowledge base was not evaluated against an independent data set. Most disturbing is that the authors do not seem to understand the problem of overtraining. Overtraining occurs when a knowledge base is calibrated too tightly to a selected data set and then fails because the relationships coded were specific to that data set, not to the population being predicted. The predictive accuracy then falls significantly when applied to an independent data set.
The shortcomings make evaluation of the quality and application of the input data difficult. Furthermore, the lack of adequate documentation on attribute extraction methods and on knowledge base structure, values, and logic makes assessment of the results impossible and precludes incorporation into a provincial knowledge base.
1. Introduction
1.1. Background
The Ministry of Sustainable Resource Management commissioned this report as part of a project to evaluate, against the PEM Inventory Standard, Input Data Quality Reports submitted in support of PEM projects. Based on these reviews, the contractor was to prepare and report on a framework and format for future Input Data Quality Reports.
The specification for a PEM standard, against which the IDQ Report is evaluated, arose from four preceding works commissioned by the Terrestrial Ecosystem Mapping Alternatives Task Force of the Resources Inventory Committee (RIC). These reports contained important background information and concepts which, for the sake of brevity, were not included in the standard but were referenced in the expectation that they would be read. These reports were:
1. Towards the Establishment of a Predictive Ecosystem Mapping Standard: A White Paper, by Keith Jones, R. Keith Jones & Associates; Del Meidinger, BC Ministry of Forests, Research Branch; Dave Clark, BC Ministry of Environment Lands and Parks, Resources Inventory Branch; and Fern Schultz, BC Ministry of Forests, Resources Inventory Branch.
2. Problem Analysis on Data Quality Assessment Issues by Dr. David Moon, CDT–Core Decision Technologies Inc.
3. Situation Analysis for Knowledge-Based Systems by Dr. David Moon, CDT–Core Decision Technologies Inc.
4. Problem Analysis on Reliability, Quality Control and Validation of Predictive Ecosystem Mapping (PEM) by Dr. Richard Sims and Jeff Matheson, R.A. Sims & Associates.
The standard drew upon four additional reports:
1. Specifications for PEM, version 2.1
2. Mapping entities, draft report
3. Protocol for Quality Assurance and Accuracy Assessment of Ecosystem Maps
4. A Method for Large Scale Biogeoclimatic Mapping in British Columbia.
Because PEM was new and largely untested, the original report was somewhat general and emphasized documentation of data and procedure rather than prescription. The intent was to ensure that a qualified PEM practitioner would be able to evaluate the quality of the input data and procedures used in the production of a PEM map based on the documentation.
In reality, the standard and its referenced reports proved an intimidating, nebulous, time consuming, and open-ended basis for developing IDQ Reports. The variability in quality and format of the reviewed reports is reflective of this reality.
1.2. Rationale for the Review
The review evaluates the submitted Input Data Quality Report in terms of both the letter and the intent of the Predictive Ecosystem Mapping Inventory Standard for Input Data Quality Assessment. It evaluates organization, completeness, and adequacy of documentation from the perspective of an inventory specialist evaluating the quality of the input data and the adequacy of the procedures used to produce the PEM. It does not evaluate the adequacy of the report as a contract deliverable and it does not evaluate the quality of the data or procedures used.
The original PEM Inventory Standard was neither prescriptive nor detailed and many of the items in the standard were recommendations rather than requirements. In addition, the original standard had not been tested. Finally, the original standard was not presented or intended as a template for reporting. It is therefore not surprising that submissions to date have been highly variable in terms of both format and completeness. The review format has therefore attempted to bring a standard format to the review and by extension provide an initial template for future IDQ reports.
1.3. Scope of the Review
The review deals only with input data quality reporting, documentation of input data layer processing, and documentation of the knowledge base. It does not evaluate the adequacy of the data or procedures used, and it does not evaluate the conclusions presented in the report. The review evaluates only whether or not there is sufficient information presented for a knowledgeable PEM practitioner to evaluate the adequacy or efficacy of the input data and approach used.
1.4. Organization of the review
The organization of this review attempts to evaluate and report the elements of the PEM process in the logical order in which they are performed and represents a deviation from the order of presentation in the standard.
1.4.1. Introduction
The introduction to the review presents:
1. Background to the review.
2. Rationale for the review.
3. The organization of the review (this section).
1.4.2. The review proper
The review proper has the following major elements:
1. The organization, presentation, and completeness of the report.
2. The individual input data layers.
3. The knowledge base, and validation of the knowledge base.
1.4.3. Conclusions
A general conclusion as to how well the IDQ report meets the letter and intent of the standard.
1.5. Elements of the Review
1.5.1. Organization and Presentation
The review will evaluate the report for organization and presentation; for ease of use and retrieval of information; for format and presentation of report elements including supporting data, use of tables and figures; and for appropriate summary and conclusions (specifically, whether they are present and supported by the content of the report).
1.5.2. Field Sampling and Field Data
The standard has no requirement for field sampling; however, the intent of the standard requires the documentation listed below for both third party field data and field data collected by the contractor in support of the project.
The development of the knowledge base is based in part on field data, where the attributes being used in the predictive process are collected at locations where the site series is known. These sites may be sampled by the PEM contractor in support of the project, by a third party in support of non-PEM activities, or by a combination of both. Whatever option is chosen, the IDQ report should include the following documentation.
1. The kind, frequency, and distribution of field samples.
2. The method of sample selection (e.g., random, stratified random, selective/modal, etc.) and if stratified or selective the stratification or selection criteria.
3. Quality control and Quality Assurance protocols applied to the field data.
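As an illustration of the level of detail the sample selection documentation could reach, the following is a minimal sketch of a stratified random selection. It is not the procedure used in the reviewed project; the function name, the strata (`unit_A`, `unit_B`), the candidate sites, and the sample size are all hypothetical. Recording the strata, the number of samples per stratum, and the random seed is what would allow a reviewer to reproduce the selection.

```python
import random
from collections import defaultdict


def stratified_random_sample(candidates, stratum_of, per_stratum, seed=0):
    """Pick up to `per_stratum` candidate sites at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    by_stratum = defaultdict(list)
    for site in candidates:
        by_stratum[stratum_of(site)].append(site)
    selected = {}
    for stratum, sites in sorted(by_stratum.items()):
        k = min(per_stratum, len(sites))  # a stratum may hold fewer sites than requested
        selected[stratum] = rng.sample(sites, k)
    return selected


# Hypothetical candidate sites: (site id, stratum label).
candidates = [(i, "unit_A" if i % 3 else "unit_B") for i in range(30)]
sample = stratified_random_sample(candidates, stratum_of=lambda s: s[1], per_stratum=5)
```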
1.5.3. Input Data Layers
Compilation Metadata
Input map metadata provides basic information to ensure that the nature and limitations of input maps are understood before creating PEM input data layers. Most important of these are the original base map, projection, and methods used for compilation of the map and the history and nature of changes to the original map. Some consultants are unaware that TRIM data is compiled and distributed in Albers projection while forest cover and other maps use Universal Transverse Mercator. If obtained in digital form, the difference will be obvious when processing is attempted. However, if paper maps are digitized by the contractor and the difference is not known, it is probable the map will be digitized using the wrong projection. This difference could produce significant positional discrepancies in larger project areas. The metadata should include the process used to match projections, the magnitude of spatial shifts (rubber sheeting) required during the conversion, the verification of thematic boundary accuracy (if any) and the nature of any thematic content changes during the update.
Spatial Quality of Input Data
Spatial Data Integrity
Reconciliation to TRIM
The spatial integrity metadata should include the method used to match projections between input maps or confirmation that the conversion was done. The standard established a procedure for evaluating the consistency of the input data map with TRIM features, particularly hydrography. The standard is inadequate for any maps “retrofitted” to TRIM features (e.g., forest cover) unless the procedures and degree of spatial adjustment required to get conformance of TRIM features is recorded.
Spatial integrity:
The standard requires a measure of error for lines which fail to join at map sheet boundaries and for label consistency for polygons crossing map sheet boundaries.
Spatial Accuracy of thematic boundaries:
A more appropriate measure of spatial data integrity is the measurement of geographic coordinates for non-TRIM features such as cultural features which are common to both maps (e.g., cutblocks and roads).
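One way such a measurement could be summarized is as a root-mean-square offset between matched features. The sketch below assumes that both maps are already in the same projection and units, and that each pair of coordinates refers to the same ground feature; the coordinates are hypothetical and the standard does not prescribe this statistic.

```python
import math


def positional_rmse(pairs):
    """Root-mean-square horizontal offset between matched points, in map units."""
    squared = [(xa - xb) ** 2 + (ya - yb) ** 2 for (xa, ya), (xb, yb) in pairs]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical matched features (e.g., road intersections), coordinates in metres:
# ((x, y) on the input map, (x, y) on the reference map).
pairs = [
    ((500000.0, 5600000.0), (500003.0, 5600004.0)),
    ((501000.0, 5601000.0), (501000.0, 5601005.0)),
    ((502000.0, 5602000.0), (502005.0, 5602000.0)),
]
rmse = positional_rmse(pairs)  # each pair is offset by 5 m, so the RMSE is 5 m
```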
Thematic Data Input Quality
The standard requires that for all input data layers, the suitability of the input data be evaluated for use in the PEM project. Only the spatial accuracy of TRIM data may be taken as given and even this refers only to TRIM features (including digital elevation points). It does not refer to thematic accuracy or to the interpolation of or derivation of landscape attributes such as slope, aspect, shape, slope position et cetera from TRIM data. The resolution of the TRIM data may not be able to adequately portray the scale of landscape characteristics important to the PEM process. This will be especially true in low relief and complex terrain such as hummocky kame and kettle topography, glacial fluvial deposits and others.
Thematic Accuracy
This element, while not required, refers to the validation of thematic attributes by comparing attributes from known geospatial coordinates to those attributes displayed at those coordinates on the map.
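The comparison described above can be tabulated as an overall agreement rate together with counts of each (field, map) class pair. The following is a minimal sketch under the assumption that each field observation has already been matched to the map attribute at its coordinates; the drainage classes and observations are hypothetical.

```python
from collections import Counter


def thematic_agreement(observations):
    """Return the overall agreement rate and the counts of (field, map) class pairs."""
    pairs = Counter(observations)
    agree = sum(n for (field, mapped), n in pairs.items() if field == mapped)
    return agree / len(observations), pairs


# Hypothetical drainage classes: (class observed in the field, class shown on the map).
observations = [
    ("well", "well"), ("well", "moderate"), ("moderate", "moderate"),
    ("poor", "poor"), ("poor", "moderate"), ("well", "well"),
    ("moderate", "moderate"), ("poor", "poor"),
]
overall, confusion = thematic_agreement(observations)
```

The pair counts show which classes are confused with which, something a single agreement figure hides.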
Map Entity Suitability
Mapping Concepts
The standard requires an evaluation for appropriateness of and interpretive issues related to the nature of the mapping entities (things mapped e.g., terrain units) and the nature of the map entities (delineations on the map e.g., simple versus complex composition). The review will look for demonstrated understanding of the nature of the mapping concepts used in the input data layers, for an understanding of the interpretive issues related to their use for predictive ecosystem mapping, and for documentation of how the issues were dealt with. Specifically, the review will look for documentation evaluating the resolution/complexity of the mapping and map entities, boundary precision for thematic maps, and boundary accuracy for thematic maps. While not specifically required in the standard, these issues are relevant to predictive ecosystem mapping and should be evaluated by competent practitioners.
Resolution/Complexity
In the case of thematic maps, resolution refers to the level of detail attached to the mapping entities used in the input data layer and complexity refers to the number and type of mapping individuals used as components in the map entity. With digital elevation data, resolution refers to the precision of the elevation measurement and the spacing of the elevation data points.
Boundary Precision
Boundary precision refers to the sharpness of the transition between adjacent map units and/or the confidence with which the mapper has drawn the line. Both will influence the accuracy, precision, and interpretation of any spatial overlay product.
Boundary Accuracy
Boundary accuracy refers to how closely the delineation of a boundary on the map corresponds to its true location on the ground.
Quality Control
The PEM standard did not require reference to Standard Operating Procedures (SOPs) or quality control protocols although elements of QC protocols are inherent in some of the metadata requirements. Despite this, the intent of the PEM Inventory Standard would be best met with a combination of SOPs and quality control procedures to ensure that the SOPs were applied appropriately.
Standard Operating Procedures
Standard operating procedures refer to procedures that have been extensively documented and tested to ensure that they produce consistent, reliable results when followed. They may be widely accepted and adopted, as with ISO standards, or they may simply be internal procedures that have been well documented and tested. The principle behind a SOP is that once tested and documented, it is only necessary to confirm that the procedure was followed and quality assurance testing can be significantly reduced.
Quality Control Protocols
Quality control protocols refer to application of standard procedures used to ensure the quality and integrity of either data or products, or it can refer to a series of quality assurance procedures to ensure that acceptable levels of quality are being met. In the case of SOPs, the procedure is assumed to produce acceptable quality and the intent is to ensure that the standard procedures have been applied. Examples include procedures to ensure that field data is correctly transferred to digital format or colour map edits to ensure that thematic data is correctly attached to polygons. Quality control protocols should consist of defined procedures for the production or editing of data, information, or intermediate or final products and a formal sign-off to confirm that the procedure was implemented as documented.
Quality Assurance
Unlike quality control, quality assurance consists of actual testing of the interim and final products to ensure that acceptable levels of accuracy or reliability are being achieved. When coupled with standard operating procedures, the level and frequency of quality assurance can be significantly reduced.
Knowledge base validation was the only Quality Assurance procedure required for Input Data Quality Reporting.
Meta Data
The PEM standard establishes minimum levels of documentation and meta-data required to evaluate the quality of input data, predictive procedures, and output products of PEM. The meta-data specified below meet three needs.
1. They provide sufficient information about the nature of the input entities, input data, predictive procedures, and output products for a qualified PEM practitioner to understand the limitations of these items for PEM applications.
2. Their compilation by the PEM practitioner ensures that the practitioner has researched the input data and adequately documented the procedures and output products.
3. A longer-term goal of the PEM standard is the eventual integration of PEM/TEM data, information, and knowledge into a single logical data model and repository. The task of integrating TEM with PEM is beyond the scope of this standard but this section will provide the documentation and meta-data necessary to construct such a repository.
Meta data are required in the following areas.
1. Input map source, base, compilation, and map entities.
2. Input map processing including attribute extraction/derivation.
3. Knowledge base and knowledge processing algorithms.
Thematic Compilation and Derivation
Attribute Collection/Derivation and Compilation
The derivation and/or compilation of thematic data layers requires the implementation of a set of procedures for either the original data capture and compilation or the evaluation and processing of third party input data. The standard requires documentation of these procedures either by reference to appended descriptions or to published documents.
Wherever possible these references should be to tested Standard Operating Procedures described above and should reference the quality control procedures used to confirm the quality of data or the application of tested Standard Operating Procedures.
Attribute Definition
The standard requires documentation of the attributes used in the PEM process. Elements of this documentation are its definition, domain, scale, and units of measure. The standard provides detailed specifications that should be followed.
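To show how the four documentation elements named above (definition, domain, scale, and units of measure) might be captured in a machine-readable data dictionary, the following sketch defines one record type. The field layout, attribute names, and domains are hypothetical and are not taken from the standard's specifications.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AttributeDefinition:
    """One data dictionary entry; the layout is an assumption for illustration."""
    name: str
    definition: str
    scale: str                   # "nominal", "ordinal", "interval", or "ratio"
    domain: Tuple                # allowed classes, or (min, max) for numeric attributes
    units: Optional[str] = None  # None for class-valued attributes

    def accepts(self, value):
        """Check a value against the documented domain."""
        if self.scale in ("nominal", "ordinal"):
            return value in self.domain
        low, high = self.domain
        return low <= value <= high


# Hypothetical entries.
slope = AttributeDefinition("slope_pct", "Mean polygon slope", "ratio", (0.0, 200.0), "percent")
drainage = AttributeDefinition(
    "drainage", "Soil drainage class", "ordinal",
    ("rapid", "well", "moderate", "imperfect", "poor"),
)
```

A definition held in this form can be checked against the data programmatically, rather than being re-keyed from a report table.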
1.5.4. Knowledge Base
Documentation of the knowledge base requires the following.
1. The PEM entities being predicted.
2. The attributes being used to predict the entities.
3. The method of compilation of attributes for the spatial entities being predicted.
4. The logic, values, and algorithms used to make the prediction.
The standard provides detailed specifications for documentation of the knowledge base.
2. Review
2.1. Organization and Presentation
The organization of the report is clear and consistent and makes effective use of appropriately placed embedded tables and figures. Input metadata and input data quality for each input data theme are discussed separately from input data processing. Input data processing is discussed in general terms and it is sometimes difficult to determine how individual attributes have been processed. Since input processing and attribute derivation/extraction will differ between input data themes, it would be more effective to organize the report either by input data source and then each element of the IDQ report (e.g., metadata, spatial quality, thematic quality, extraction/derivation) or by report element and then by input data theme.
2.2. Field Sampling and Field Data
The project used 137 field samples to verify thematic accuracy for input data layers and derivatives. Results were reported for derived and extracted attributes for each input data theme. While this is a useful approach, the following limitations were noted.
1. The kind, frequency, and distribution of field samples were not documented.
2. The method of sample selection was not documented; the report therefore did not provide adequate information for a qualified PEM practitioner to evaluate the quality of the data.
3. Quality control and Quality Assurance protocols applied to the field data were not documented, which limits confidence in the accuracy of the plot data.
2.3. Input Data Layers
2.3.1. Input Data Sources
Compilation Metadata for Input Maps
TRIM: The report included most applicable compilation metadata for TRIM base maps and digital elevation data. However, there is no identification of the compilation format for TRIM as Albers.
FC1: The report failed to provide sufficient compilation metadata for forest cover. There was no discussion of mapping or map entities and the report failed to discuss the history of the FC1 maps, the nature of modifications or revisions, or the implications of the nature of the retrofit to the TRIM base on the accuracy of the forest cover polygon boundaries.
BEC: The report met most of the compilation metadata requirements but failed to explicitly identify the compilation scale or the mapping and map entities.
Bioterrain: The report met much of the compilation metadata but did not specify whether the compilation scale was the same as the publication scale. The report also failed to identify the mapping and map entities used in the project. Since bioterrain differs from the terrain standard, this cannot simply be referenced to the provincial standard.
Spatial Quality of Input Maps
TRIM: The PEM Inventory Standard accepts TRIM I and TRIM II as the spatial standard, therefore spatial quality issues were not addressed.
FC1: The report met the standard requirement for comparison to TRIM features and edge match label consistency but did not report edge match line consistency. The report also assumed thematic boundary accuracy because the base map conformed to TRIM features but this assumption may not be valid depending on the nature of the retrofit.
BEC: The report fully met the spatial quality data requirements of the PEM Inventory Standard.
Bioterrain: The bioterrain map documentation met the requirement of the standard but provided no evaluation of spatial quality beyond provincial correlation.
Thematic Quality of Input Maps
TRIM: The PEM Inventory Standard assumes the spatial accuracy of TRIM bases and DEM to be adequate. However, the IDQ report did not address the issue of elevation sample density and associated issues of interpolated resolution relative to the scale of landscape features and the interpretive needs of the project.
FC1: There was no discussion of mapping or map entities or their implications for PEM attribute extraction. While not required by the standard, the IDQ report presents quality assurance procedures. Prism sweeps were conducted on 130 plots. The project compared species volume from field sweeps to polygon attributes of the Forest cover maps. However, the lack of information on sampling strategy, positional accuracy of field samples, and sensitivity of the PEM interpretive procedures to input errors limits the utility of the data.
BEC: The report identifies that the BEC mapping followed “A Method for Large-scale Biogeoclimatic Mapping in British Columbia, 1999, level 2”. This requires that each rule set for digital elevation modeling be tested at least once. No other quality assurance was reported. No information was provided on the nature of the test or tests.
Bioterrain: There was no discussion of mapping or map entity suitability, resolution/complexity, or boundary accuracy with respect to PEM inferences. The report cites conformance to provincial Terrain Mapping standards but, since bioterrain mapping differs from terrain mapping, additional documentation should be provided. The report cites field data from 20 randomly selected samples used to test the thematic accuracy of the bioterrain drainage and texture data but does not make clear if these are twenty samples randomly selected from the 130 existing field plots or additional plots sampled at random from the test area. The report provides no measure of positional accuracy of the sample plots or any measure of PEM sensitivity to input data errors.
2.3.2. Input Data Derivation / Compilation
In general, the IDQ report provides inadequate documentation of the nature of extracted or derived attributes and provides inadequate documentation of the method of extraction or derivation. The information needed to evaluate the extracted or derived data should be available from either standard operating procedures or documented in-house procedures and from quality assurance. These issues are discussed below under Quality Control / Quality Assurance.
Quality Control / Quality Assurance
TRIM: The IDQ report provides no reference to Standard Operating Procedures and the in-house procedure describes the derivation of slope and aspect in terms of Arc/Info without any indication of parameterization or any explanation of the logic or nature of the resultant product. For example, it appears that bioterrain polygons were assigned a single mean slope and a single slope/aspect code (inferred from digital data definitions table) but there is no indication of how the single slope aspect code was derived (e.g., the combination of the mean polygon slope and the mean polygon aspect, the dominant slope/aspect combination found in the polygon, etc.). Quality assurance was conducted using 130 field plots but no sampling strategy or plan was presented, there was no indication of the positional accuracy of the field samples, and there was no presentation of predictive sensitivity to the error levels found.
FC1: The IDQ report provides no reference to standard operating procedures for the extraction and does not refer to in-house procedures. Although some of these can be inferred from tables presented in the report, doing so is time consuming and prone to misinterpretation. In particular Indicator Species 1 and Indicator Species 2 are not described. No additional quality assurance beyond that discussed under Thematic Quality of Input Data was reported.
BEC: BEC units were not used in the derivation of or the extraction of any attributes.
Bioterrain: The IDQ report provides no reference to standard operating procedures and provides inadequate documentation for the extraction of terrain texture and slope class. For example, the digital data definitions report drainage1 and drainage2 as attributes but allow up to three terrain components. It is not clear how drainage is applied. No additional quality assurance beyond that discussed under Thematic Quality of Input Data was reported.
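The ambiguity noted under TRIM, between a code built from the mean polygon aspect and a code taken from the dominant aspect class, matters because the two derivations can disagree for the same polygon. The sketch below illustrates this with hypothetical cell aspects and a hypothetical four-class aspect coding; it does not reproduce the Arc/Info procedure used in the project, which the IDQ report does not describe.

```python
import math
from collections import Counter


def mean_aspect(aspects_deg):
    """Circular mean of aspects in degrees (0 = north, measured clockwise)."""
    east = sum(math.sin(math.radians(a)) for a in aspects_deg)
    north = sum(math.cos(math.radians(a)) for a in aspects_deg)
    return math.degrees(math.atan2(east, north)) % 360.0


def aspect_class(aspect_deg):
    """Hypothetical coding: four 90-degree classes centred on N, E, S, and W."""
    return ["N", "E", "S", "W"][int(((aspect_deg + 45.0) % 360.0) // 90.0)]


def dominant_class(aspects_deg):
    """Most frequent aspect class among the cells of a polygon."""
    return Counter(aspect_class(a) for a in aspects_deg).most_common(1)[0][0]


# Hypothetical DEM cell aspects within one polygon.
cells = [110.0, 110.0, 110.0, 220.0, 220.0]
code_from_mean = aspect_class(mean_aspect(cells))  # class of the circular mean
code_from_dominant = dominant_class(cells)         # most common cell class
```

For these cells the circular mean is roughly 149 degrees, which codes as "S", while the dominant class is "E". A reviewer cannot tell which result a polygon code represents unless the derivation is documented.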
Attribute Metadata
Entity Described by extracted attributes
It appears that the entities being described are all cross products of the overlay process after sliver removal. However, it is not clear whether the attribute extraction occurs before or after the overlay process, nor is it clear whether some attributes apply to a bioterrain polygon as a whole, the components of the bioterrain polygon, or to the overlay resultants.
Attribute Definitions
What appear to be key knowledge base input variables (indicator species 1 and 2) are not defined. Otherwise, Tables 5, 7, and 9 of the IDQ report provide the attributes which constitute a full attribute definition. Tables 5, 7, and 9 conform to the recommendation in the standard; however, the format of the tables would require significant manual entry into a data dictionary before any programmatic correlation could take place. In addition, much of the information must be inferred or retrieved from referenced materials. While the tables are consistent with both the letter and intent of the Standard, compilation of a provincial knowledge base would be facilitated by a standard reporting format.
2.3.3. Input Data Summary Report
Table 1 presents a summary evaluation of each of the components of Input Data Layers for the data layers used in the report. This review was based on the IDQ report alone. It was time constrained and may contain errors of misinterpretation, omission, or commission. The organization of the report reviewed and the need to present a common review format for widely divergent IDQ reports made retrieval and review of specific elements difficult and error prone. In addition, some unanswered questions or areas of ambiguity might have been resolved by reviewing the PEM map products and/or report.
Input Data Sources                      DEM   FC    BEC   Bioterrain
Compilation Metadata
  Citation*                             NA    F     NA    NA
  Consultant/Department*                F     F     F     F
  Compilation Scale*                    NA    X     X     X
  Publication Scale*                    F     F     NA    F
  Period of Compilation*                F     F     F     F
  Original base / projection            NA    X     NA    NA
  Current base / projection*            F     P     P     P
  Mapping Entities*                     NA    X     X     X
  Map Entities*                         NA    X     X     X
Spatial Quality
  Reconciliation to TRIM*               NA    M     NA    NA
Spatial Integrity
  Edge match lines*                     NA    X     F     F
  Edge match labels*                    NA    F     F     F
  Raster Size*                          NA    NA    NA    NA
  Thematic Boundary Accuracy            NA    X     X     X
Thematic Quality
  Map Entity Suitability*               X     X     NA    X
  Resolution/Complexity                 X     X     X     X
  Boundary Accuracy                     NA    X     X     X
Quality Control
  Standard Operating Procedures         NA    NA    F     F
  Monitoring Protocols                  NA    NA    M     M
Quality Assurance
  Sample strategy/plan                  NA    P     X     F
  Positional Accuracy                   NA    X     X     X
  PEM Sensitivity                       NA    X     X     X
  Accuracy*                             NA    X     X     F
Input Data Derivation / Compilation
Quality Control / Assurance
  Standard Operating Procedures*        X     X     NA    X
  In-house Procedures*                  I     P     NA    I
Quality Assurance
  Sample Strategy/Plan                  P     X     NA    X
  Positional Accuracy                   P     X     NA    X
  PEM Sensitivity                       X     X     NA    X
  Accuracy                              M     M     X     F
Attribute Metadata
  Entity Described*                     X     X     NA    I
  Data Definition*                      M     I     NA    I
*Indicates that the feature was identified in the PEM Inventory Standard
Codes
NA = Not Applicable
F = Fully Met (meets the intent of the standard)
M = Met (meets the letter of the standard)
P = Partially Met (enough information to be useful but is incomplete)
I = Inadequate (partially met but insufficient to be useful)
2.4. Knowledge Base and Algorithm Requirements
2.4.1. Entities
The IDQ report does not explicitly define the mapping and map entities being predicted by the PEM process. The mapping entities appear to be the resultants of a bioterrain / forest cover overlay treated as components of the parent bioterrain polygon. However, the definition of the PEM digital database shows only two site series deciles. It is unclear how these are derived or what entities are being portrayed on the map.
2.4.2. Knowledge Base Attribute Compilation
Cross Product Resolution
The IDQ report discusses the cross product resultants of the overlay process but does not explain how the number of cross products is resolved to produce predictive entities. As noted above, it is unclear how the process moves from the overlay process to the prediction of only two site series deciles per terrain polygon.
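To make concrete the kind of explanation that is missing, the following is a minimal sketch of one way overlay resultants could be collapsed to two site series deciles per polygon. It is an assumption for illustration only and is not the consultant's method: the function name `top_two_deciles`, the tuple layout, the area-weighted ranking rule, and the sample site series codes are all hypothetical.

```python
from collections import defaultdict


def top_two_deciles(resultants):
    """Collapse overlay resultants to at most two site series deciles per polygon.

    resultants: iterable of (polygon_id, site_series, area) tuples, one per
    cross product resultant, with positive areas. The rule is hypothetical.
    """
    # Total area predicted for each site series within each parent polygon.
    area = defaultdict(lambda: defaultdict(float))
    for polygon_id, site_series, a in resultants:
        area[polygon_id][site_series] += a

    result = {}
    for polygon_id, by_series in area.items():
        # Rank site series by area, largest first; break ties by code.
        ranked = sorted(by_series.items(), key=lambda kv: (-kv[1], kv[0]))[:2]
        kept = sum(a for _, a in ranked)
        # Express the leading share in tenths, rounding half up.
        first = int(10 * ranked[0][1] / kept + 0.5)
        deciles = [(ranked[0][0], first)]
        # Keep the second site series only if it retains at least one decile.
        if len(ranked) == 2 and first < 10:
            deciles.append((ranked[1][0], 10 - first))
        result[polygon_id] = deciles
    return result


sample = [
    ("P1", "01", 6.0),
    ("P1", "05", 3.0),
    ("P1", "07", 1.0),
    ("P2", "01", 2.0),
]
print(top_two_deciles(sample))
```

Whatever rule was actually used, documenting it at this level of detail (ranking criterion, treatment of discarded resultants, rounding) would allow a reviewer to trace each map entity back to its overlay resultants.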
Sliver adjustment
The method and criteria of sliver removal are adequately explained.
2.4.3. Belief Matrices
The IDQ report states that a belief matrix approach was used in the predictive process, and Table 7 gives definitions of the attributes used to infer site series. Table 8 is said to present the logic applying attribute belief values to the predictive process but makes no reference to the entities being predicted. The report fails to identify whether Table 8 refers to a single site series, and the structure of the belief matrix and interpretive algorithm is unclear. It is possible that the consultant considered the knowledge base to be proprietary and successfully obfuscated their approach. Unfortunately, they also made it impossible to evaluate the approach used.
Sensitivity Analysis
The PEM Inventory Standard did not discuss sensitivity analysis. The IDQ report likewise did not discuss the sensitivity of prediction to errors in input data or identify whether attribute errors interact as cumulative, compensating, or random in their effect.
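The kind of simple error analysis meant here can be sketched as a one-at-a-time versus joint perturbation check. The example below is a toy under stated assumptions: the belief values, attribute classes, site series codes, field plots, and the summed-belief prediction rule are all invented for illustration and do not come from the IDQ report or its Table 8.

```python
# Hypothetical belief values, BELIEF[attribute][class][site_series].
# Attributes, classes, site series codes, and numbers are invented.
BELIEF = {
    "slope": {
        "gentle": {"01": 0.3, "05": 0.7},
        "steep": {"01": 0.9, "05": 0.1},
    },
    "moisture": {
        "dry": {"01": 0.8, "05": 0.2},
        "wet": {"01": 0.2, "05": 0.8},
    },
}

# Hypothetical field plots with "true" attribute classes.
PLOTS = (
    [{"slope": "gentle", "moisture": "wet"}] * 3
    + [{"slope": "gentle", "moisture": "dry"}]
    + [{"slope": "steep", "moisture": "dry"}]
)


def predict(record):
    """Return the site series with the highest summed belief (assumed rule)."""
    scores = {}
    for attr, cls in record.items():
        for series, belief in BELIEF[attr][cls].items():
            scores[series] = scores.get(series, 0.0) + belief
    return max(sorted(scores), key=scores.get)


def misclassify(attr, cls):
    """Simulate an input error: return the other class of a two-class attribute."""
    return next(c for c in BELIEF[attr] if c != cls)


def change_rate(plots, wrong_attrs):
    """Fraction of plots whose prediction changes when wrong_attrs are all in error."""
    changed = 0
    for plot in plots:
        bad = {a: misclassify(a, c) if a in wrong_attrs else c for a, c in plot.items()}
        if predict(bad) != predict(plot):
            changed += 1
    return changed / len(plots)


for wrong in (["slope"], ["moisture"], ["slope", "moisture"]):
    print(wrong, change_rate(PLOTS, wrong))
```

In this toy case the prediction changes on 0.6 of plots when slope alone is wrong, 0.8 when moisture alone is wrong, and 0.8 when both are wrong. The joint effect is smaller than the sum of the individual effects, so the two errors partly compensate. A real analysis would substitute the actual belief values and the attribute error rates measured against field data.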
2.4.4. Knowledge Base Validation
Test Description
The discussion of knowledge base validation is confusing and does not appear to conform to the PEM Inventory Standard requirement for independent validation of the knowledge base. The rationale presented is unconvincing.
Test Results
3. Conclusions
The report does not fully meet the letter or the intent of the PEM Inventory Standard for Input Data Quality Reporting.
1. Documentation of metadata for the input data layers, while conforming largely to the letter of the standard, was incomplete. Of special concern was that the history or pedigree of Forest Cover map revisions, base map conversion, and retrofit procedures were ignored. Also of concern is the failure to explicitly recognize that the original TRIM projection is Albers and to document that conversion to UTM was done appropriately.
2. Documentation of the attribute extraction/derivation procedures was incomplete.
3. Documentation of quality control and quality assurance was inadequate. While the use of field plots established to support development of the knowledge base as a test for the quality of primary or derived input data was commendable, failure to document sampling strategy and design, failure to document attribute derivation and extraction procedures, and failure to evaluate the nature of mapping entities and map entities (spatial accuracy and thematic resolution and accuracy) limited the utility of the information.
4. Despite relatively high error values in those attributes tested against field data, there was no discussion of the sensitivity of ecosystem prediction to errors in the input data. Failure to provide a simple error analysis identifying the impact of individual and cumulative attribute errors on predictive error significantly reduces the value of this data.
5. The knowledge base was not adequately documented. The attributes used in the knowledge base were not fully defined, and the interpretive logic of the attributes used in the predictive process was inadequately described.
6. The knowledge base was not evaluated against an independent data set.
These shortcomings make evaluation of the quality and application of the input data difficult. Furthermore, the failure to adequately document attribute extraction methods and knowledge base structure, values, and logic makes assessment of the results impossible and precludes incorporation into a provincial knowledge base.