A Practical Application of Archaeological Field Drawing Data using Semantic Web Principles
4.3 The data
4.3.2 The data from Hungate
The data from Hungate was exported from the Integrated Archaeological Database (IADB), which is the bespoke data management system developed by Michael Rains at the York Archaeological Trust (YAT). The data is currently unpublished, and permission was kindly given by YAT for its experimental use as part of this research. The IADB is the in-house system used at YAT, but is also made available for download without cost, and is used by several other academic and commercial archaeological field units (Rains 2011). Much of the development of the IADB has been through Rains’ partnership with the Silchester Town Life project based at the University of Reading. This collaboration has resulted in original work in several areas, including digital data capture (Fisher et al. 2009), virtual research environments (Rains 2007) and experiments in concurrent excavation, analysis and publication (Clarke et al. 2002).
Whereas the Cottam archive represents a complete archaeological dataset conforming to best practice principles using traditional software and methods (and specifically commercial software designed for other disciplines, but adapted for use by archaeologists), the IADB represents a best practice exemplar for a complete data management system, incorporating new technologies and ideas specifically to improve the archaeological research process, while maintaining a focus on everyday practical use. Steve Stead and Pete Clark at the Scottish Urban Archaeological Trust (SUAT) developed the initial concept for an integrated database for archaeology in the late 1980s. When Rains replaced Stead in 1989, he implemented the concept based on other projects he had developed for Durham University and (what became) Historic Scotland, and created the IADB. At first the IADB was meant to be a framework for post-excavation analysis, but it has since become a full virtual research environment. In development for over 20 years, the IADB has been implemented with a variety of programming languages and commercial software, but is currently based entirely on open-source, Web server-based solutions: PHP with MySQL, and SVG for vector graphics (Rains 2011).
As the fundamental design principle of the IADB is that it is integrated, the data had to be split into its constituent parts for export before it could be used in this research. Export options from the IADB include SVG and DXF for vector drawings, and CSV and SQL for data held in tables. One of the most distinctive features of the IADB is its use of native SVG for all vector drawing. SVG is the W3C XML standard for vector data on the Web, and is also a non-proprietary vector data format now available as an export option across most popular vector-based drawing and spatial programs. With the advent of Internet Explorer 9, SVG is finally supported across all major browsers, which means it also displays natively on the Web in most cases. One of the original development goals for the SVG standard was to create an exchange format for vector data, and as such it seems an ideal archival format for Web and non-Web use alike, with Autodesk’s DXF format being the de facto standard in the absence of a true non-proprietary format. While the major CAD, GIS and vector drawing programs now support the export of data in SVG format, import support is still lacking, and until this is remedied SVG cannot be considered archival (Meng 2008, 1019). As momentum around SVG builds, however, the IADB will be perfectly placed to take advantage of it.
Because SVG cannot be imported natively into AutoCAD 2008 or ArcMap 9.3.1, the vector plan drawing exported from the IADB in DXF format was used. Additional data about the contexts was also exported in both CSV and SQL format, including a table containing the stratigraphic relationships between the contexts (limited to ‘later than’). The data in CSV format was used for this research. As only the on-going excavation in Area H2 has yielded data from the Anglo-Scandinavian period of occupation at Hungate, this was the dataset exported by Rains. The dataset should not be considered complete, however, as only the data from the deep trench and a small buffer zone surrounding it had been processed and input into the IADB as of 2011. More is waiting to be input, and still more Anglo-Scandinavian material will certainly be found before the excavation is finished. While beyond the scope of this research, far more information about the contexts is available from within the IADB, including finds data, images, additional documents relating to the site, and bibliographic references. The IADB also allows contexts to be grouped together as sets, and sets to be grouped together within phases for post-excavation analysis (Rains 2011), so the IADB provides rich potential for easy incorporation into the Semantic Web.
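To illustrate how a ‘later than’ table of this kind might be read from CSV, the following is a minimal Python sketch. The column names (`context`, `later_than`), the context numbers and the function names are all hypothetical, chosen for illustration only; they are not taken from the IADB export itself. Because only one direction of the relationship is stored, the sketch also derives the inverse (‘earlier than’) mapping.

```python
import csv
import io
from collections import defaultdict

# Illustrative data only: the column names and context numbers are assumptions,
# not the actual layout or content of the IADB export.
SAMPLE_CSV = """context,later_than
1001,1002
1001,1003
1002,1004
"""


def load_later_than(handle):
    """Map each context to the set of contexts it is recorded as later than."""
    later = defaultdict(set)
    for row in csv.DictReader(handle):
        later[row["context"]].add(row["later_than"])
    return later


def invert(later):
    """Derive the 'earlier than' mapping from the stored 'later than' pairs."""
    earlier = defaultdict(set)
    for context, older_contexts in later.items():
        for older in older_contexts:
            earlier[older].add(context)
    return earlier


later = load_later_than(io.StringIO(SAMPLE_CSV))
earlier = invert(later)
```

Storing a single direction and deriving the other on demand keeps the table small and avoids the two directions falling out of step, which is one plausible reason for an export limited to ‘later than’.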
To make the Hungate drawing ready for conversion into RDF, it was also tidied and prepared. Using Autodesk’s AutoCAD 2008, all extraneous information was removed, including the hachures and spot height measurements. Unlike the Cottam drawing, the Hungate drawing consisted entirely of closed polygons, so decisions about how to interpret areas where edges were uncertain or truncated by the limits of the excavation had already been made by staff at YAT before the data was exported.
Figure 50: The location of the data from the deep trench from Hungate (in red), as exported from the IADB and projected in GIS.
Figure 51: The data from the deep trench from Hungate, as exported from the IADB, showing each context as a closed polygon.
As with the Cottam drawing, the size of the excavation trench probably did not require assigning a projection to the data, but the CAD drawing was brought into GIS using ESRI’s ArcGIS 9:ArcMap 9.3.1 to georeference and project it. Context numbers were already included in the annotation table as exported from the IADB, but two columns were added so that the ‘calculate geometry’ function could be used to capture the x and y coordinates of the centroid of each context, along with calculations for area and perimeter. From this point the process was the same as with the Cottam drawing, with the relevant data from the GIS attribute table being exported as GML and processed through the STELLARPreloader Java application. The information about each context was then extracted from the GML file, and the additional WKT information for Hungate included: Area, Perimeter, CentroidX, CentroidY and Centroid. The STELLARPreloader then converted the data into CSV format for export into the next phase of transformation into RDF, bringing both drawings to the same point in the workflow process.
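For readers unfamiliar with how such values are derived, the following minimal Python sketch computes the same five attributes for a single closed polygon using the standard shoelace formulae. It is an illustration under stated assumptions, not a reproduction of ArcMap’s ‘calculate geometry’ function or of the STELLARPreloader: the function name `polygon_metrics`, the dictionary layout and the example rectangle are all hypothetical, and the polygon is assumed to be simple (non-self-intersecting) and in a planar coordinate system.

```python
from math import hypot


def polygon_metrics(ring):
    """Return area, perimeter and centroid for a simple closed polygon.

    `ring` is a list of (x, y) vertices in order; the closing vertex
    may be omitted. Coordinates are assumed to be planar.
    """
    pts = list(ring)
    if pts[0] != pts[-1]:
        pts.append(pts[0])  # close the ring if the caller did not

    twice_area = 0.0  # signed; positive for counter-clockwise rings
    cx = cy = 0.0
    perimeter = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += hypot(x1 - x0, y1 - y0)

    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")

    # Centroid = sum / (6 * signed area) = sum / (3 * twice_area)
    cx /= 3 * twice_area
    cy /= 3 * twice_area

    return {
        "Area": abs(twice_area) / 2,
        "Perimeter": perimeter,
        "CentroidX": cx,
        "CentroidY": cy,
        "Centroid": f"POINT ({cx} {cy})",  # centroid expressed as WKT
    }


# Hypothetical 4 m x 2 m rectangular context, for illustration only.
metrics = polygon_metrics([(0, 0), (4, 0), (4, 2), (0, 2)])
```

For the example rectangle this gives an area of 8, a perimeter of 12 and a centroid at (2, 1). Note that the centroid of an irregular or concave context may fall outside its outline, which matters if the point is later used as a label position or as a proxy for the location of the context.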