The currently implemented CEOP-AEGIS Data Portal is based on the Dapper/DChart distribution 2.2.0. Dapper is a back-end data server based on OPeNDAP technology that can stream conformant scientific data stored in the NetCDF file format (see section 2.3 on page 56). NetCDF consists of a set of machine-independent data formats and software libraries with associated APIs that were specifically designed for the creation, access, and sharing of array-oriented geospatial data, as explained in section 2.1 on page 14. OPeNDAP is an open-source standard data format and protocol for accessing remote, distributed scientific data over the internet and is explained in detail in section 2.2 on page 39. The front-end of the CEOP-AEGIS Data Interface is a user-friendly, browser-based web interface that takes advantage of the OPeNDAP-enabled client program DChart (see section 2.4 on page 62). Together, NetCDF, OPeNDAP, Dapper and DChart form a powerful combination that makes it possible to reach the objectives of this thesis, as defined in section 1.3 on page 8. However, these technologies must be understood before they can be employed.
The newly developed CEOP-AEGIS Data Interface implements a NetCDF data model that is considered adequate for storing final project output data. This data model must, however, be adapted not only to the needs of the CEOP-AEGIS project partners and their specific data, but also to the requirements and constraints of the technologies NetCDF, OPeNDAP, Dapper and DChart. Given their importance within this work, the technical foundations of these four technologies were evaluated in depth in chapter 2. Publications by the authors who developed these technologies form the basis of the literature survey in that chapter.
From the knowledge gained in chapter 2, a NetCDF implementation was subsequently derived that is adapted both to the requirements of the evaluated technologies and to the needs of CEOP-AEGIS project data. This implementation consists of a NetCDF 3 data model in its classic format, conforming to the CF (Climate and Forecast) convention for gridded data and to an adaptation of this convention for in-situ data. Adapting several elements of the CF convention for in-situ data was necessary because Dapper supports only a limited number of conventions for this kind of data. The adaptation extends the properties of the CF convention in order to reach conformance with the Dapper In-situ Convention. The discussions that led to this decision, as well as the properties of the NetCDF implementations for both gridded and in-situ data, are documented in full in section 3.2 of chapter 3 (page 71). The same section documents both the CF convention and the Dapper In-situ Convention. NetCDF models and formats, with their capabilities and constraints, are documented in chapter 2.
The NetCDF data models determined for CEOP-AEGIS were implemented as part of a newly developed data interface application, as described in section 3.3 on page 93. This program is written in the Python programming language and is based entirely on open-source solutions. It follows international standards to convert and aggregate heterogeneous input data of project partners into standardized NetCDF output files. On the basis of multiple standardized project datasets converted with this data interface, several maps and animations were finally produced as part of this thesis (see section 3.4 on page 107). Users interact with the data interface application through convenient and flexible command-line options, which also allow the application to be controlled automatically by batch files. An XML settings file defines default values for both the functionality of the interface application and the conversion of data. All components of the program use a flexible error logging system that writes logging information to one single file.
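The interplay of command-line options, XML default values and a single log file can be illustrated with a minimal sketch. It is not the code of the data interface application: the settings layout (`<settings>`/`<default>`), the option names `--logfile` and `--aggregate`, and the logger name `interface` are all assumptions made for illustration, and only modules of the Python standard library are used.

```python
import argparse
import logging
import xml.etree.ElementTree as ET

# Hypothetical settings document; element and attribute names are illustrative only.
SETTINGS_XML = """<settings>
  <default name="logfile" value="interface.log"/>
  <default name="aggregate" value="false"/>
</settings>"""


def load_defaults(xml_text):
    """Return the default values from the settings XML as a dict of strings."""
    root = ET.fromstring(xml_text)
    return {d.get("name"): d.get("value") for d in root.iter("default")}


def build_parser(defaults):
    """Build a command-line parser whose defaults come from the settings file."""
    parser = argparse.ArgumentParser(description="Convert input data to NetCDF (sketch)")
    parser.add_argument("input", help="path of the input dataset")
    parser.add_argument("--logfile", default=defaults.get("logfile", "interface.log"))
    parser.add_argument("--aggregate", action="store_true",
                        default=defaults.get("aggregate") == "true")
    return parser


def configure_logging(logfile):
    """Attach one file handler so that all components log to a single file."""
    logger = logging.getLogger("interface")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(logfile)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Because every option has a default taken from the settings file, a batch file only needs to pass the values that differ from those defaults, for example `--aggregate` for a run that should join the converted files over time.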
Input data of project partners is first converted to a newly introduced intermediate data model. It consists of three elements that together form a unit and must be consistent with each other: a multidimensional array object saved as a NumPy file serves as the container for numerical data; NetCDF-related metadata is stored in the XML document format of the NetCDF Markup Language (NcML); and a further XML file stores coordinate information that describes the index values of the corresponding multidimensional NumPy array.
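The two XML elements of such an intermediate dataset can be sketched as follows. The NcML element and attribute names (`netcdf`, `dimension`, `variable`, `attribute`) follow the NcML schema, whereas the layout of the coordinate file (`coordinates`/`axis`) and the function names are assumptions made for illustration; the actual file layout used by the data interface is documented in chapter 3. The NumPy array itself is left out so that the sketch needs only the standard library.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
ET.register_namespace("", NCML_NS)  # serialize without an "ns0:" prefix


def build_ncml(dims, var_name, var_type, attrs):
    """Build an NcML tree describing one variable and its dimensions.

    dims is a list of (name, length) pairs in the axis order of the array.
    """
    root = ET.Element("{%s}netcdf" % NCML_NS)
    for name, length in dims:
        ET.SubElement(root, "{%s}dimension" % NCML_NS,
                      name=name, length=str(length))
    var = ET.SubElement(root, "{%s}variable" % NCML_NS,
                        name=var_name, type=var_type,
                        shape=" ".join(name for name, _ in dims))
    for key, value in attrs.items():
        ET.SubElement(var, "{%s}attribute" % NCML_NS, name=key, value=value)
    return root


def build_coordinates(coords):
    """Build the (hypothetical) coordinate file: one element per array axis."""
    root = ET.Element("coordinates")
    for name, values in coords.items():
        axis = ET.SubElement(root, "axis", name=name)
        axis.text = " ".join(str(v) for v in values)
    return root


# Example: a 2 x 3 x 4 array of temperatures on a time/lat/lon grid.
ncml = build_ncml([("time", 2), ("lat", 3), ("lon", 4)],
                  "temperature", "float", {"units": "K"})
coordinates = build_coordinates({"lat": [10.0, 20.0, 30.0]})
```

Keeping the metadata and the coordinate values in separate XML files means that either can be edited by hand or by a script without touching the numerical data.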
This standardized intermediate data model strictly separates metadata from data content. Its elements can easily be modified with various Python APIs in order to reach conformance with the requirements of the CEOP-AEGIS Data Model for NetCDF. Structured metadata can also easily be completed by the user, since it is stored in the machine-independent XML format, which is readable by both machines and humans. Nearly any established geospatial raster data format, as well as in-situ data stored in a CSV table, can be accessed and translated to this intermediate data model within the CEOP-AEGIS Data Interface by means of Python APIs for GDAL, GrADS and CSV.
Once a dataset in the intermediate model has been made conformant with the specifications of the CEOP-AEGIS Data Model, it can be converted to standardized NetCDF output files by the interface program, which uses a Python API for the NetCDF library for this purpose. During the conversion process, several internal checks verify conformity. A conformance and consistency check of the input data is mandatory before a NetCDF file can be produced from the intermediate data model. In addition, compliance checks can be run against the CF convention for gridded data and against an adaptation of this convention in accordance with the Dapper In-situ Convention for in-situ data. The compliance checker for the CF convention is implemented as a component of this application. After a time series of input datasets has been converted to multiple standardized NetCDF output files, these files can optionally be aggregated over time into one single NetCDF file.
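The idea behind the mandatory consistency check and the optional aggregation over time can be sketched in plain Python. This is an illustration under simplifying assumptions, not the checks implemented in the data interface: the array is represented only by its shape tuple, the coordinates by a dictionary, and both function names are hypothetical.

```python
def check_consistency(shape, dim_names, coords):
    """Return a list of problems; an empty list means the parts agree.

    shape     -- shape tuple of the data array
    dim_names -- dimension names in axis order, as declared in the metadata
    coords    -- dict mapping each dimension name to its coordinate values
    """
    problems = []
    if len(shape) != len(dim_names):
        problems.append("array has %d axes but %d dimensions are declared"
                        % (len(shape), len(dim_names)))
        return problems
    for size, name in zip(shape, dim_names):
        if name not in coords:
            problems.append("no coordinate values for dimension '%s'" % name)
        elif len(coords[name]) != size:
            problems.append(
                "dimension '%s': %d coordinate values for %d array entries"
                % (name, len(coords[name]), size))
    return problems


def aggregate_over_time(datasets):
    """Join (time_values, slices) pairs along the time axis, ordered by time."""
    pairs = []
    for times, slices in datasets:
        if len(times) != len(slices):
            raise ValueError("each time value needs exactly one data slice")
        pairs.extend(zip(times, slices))
    pairs.sort(key=lambda pair: pair[0])
    times = [t for t, _ in pairs]
    if len(set(times)) != len(times):
        raise ValueError("duplicate time values in the datasets to aggregate")
    return times, [s for _, s in pairs]
```

Returning a list of problems rather than raising on the first one lets all inconsistencies be written to the log in a single run; the aggregation, by contrast, stops immediately, because a file with duplicate or missing time steps would not be a valid time series.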
The CEOP-AEGIS Data Portal and the newly developed CEOP-AEGIS Data Interface are based entirely on open-source solutions. With the CEOP-AEGIS Data Interface and its intermediate data model, various geospatial datasets can be converted to standardized and aggregated NetCDF output files. Moreover, the application can read, write, check and print standardized NetCDF files as well as data in the intermediate data model, provided that they conform to the CEOP-AEGIS data model. These features make the CEOP-AEGIS Data Interface a powerful and useful upstream application for CEOP-AEGIS related project data.