4. TECHNICAL MODEL
4.7 Looking to the future
In proposing any technical model there is always a need to consider how it may fare over time. As one interviewee put it, the terms 'technical' and 'long-term' do not sit together very well. It is, however, at the implementation level that this most applies. The aggregation model, whilst addressing the possibilities that a number of different standards and technologies can provide, is, like the CORDRA model, not intended to be tied to any specific technology.
Two perspectives can be taken in considering the future development of end-user services across repositories using an aggregation model: the level of take-up amongst other communities and initiatives, as a measure of the breadth of interest; and the potential for the model to meet broader high-level views of the technology landscape.
1. Take-up
A number of initiatives making use of aggregations have already been mentioned in this report and the associated appendices. The use of OAI-PMH, with its model of data providers and service providers, has driven many of these initiatives in the open access arena. This model has proved successful, and will continue to be so, subject to the caveats and recommendations made in this report.
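The data provider / service provider split can be illustrated with a minimal sketch of the service provider (harvester) side. It assumes the standard `ListRecords` verb with the `oai_dc` metadata prefix; the repository address, the sample response and the function names are hypothetical and are not taken from any initiative described in this report.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request URL for a data provider."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [t.text for t in rec.iterfind(".//dc:title", NS) if t.text]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # An empty or absent token means the list is complete.
    return records, (token.strip() or None)


# Hypothetical response from a data provider, used here instead of a live request.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://repo.example.org/oai", token)
```

A service provider would repeat the request with each returned token until none is supplied, adding the parsed records to its aggregation as it goes.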
It is not solely in the use of OAI-PMH, however, that aggregation is regarded as a valuable methodology, but also in the use cases that have been identified for aggregation. The Research Information Network is investigating the use of data webs, digital information and storage following a lightweight harvesting of metadata about datasets into a central registry (an aggregator), to facilitate access to and awareness of these datasets [58]. They are especially interested in the use of lightweight Semantic Web and Web 2.0 approaches to enable this. The JISC-TIME project, establishing an e-books metadata and interoperability testbed, developed an architecture that incorporated the creation of a central aggregation of e-book metadata to support the generation of standard and easily available e-book catalogue records for use in library catalogues [59]. The OpenCourseWare project in the US is developing a model for openly sharing learning materials [60]. The project promotes a model of aggregation in which the materials are, currently, made available through specific websites; it is investigating the aggregation of metadata from these websites to ease discovery across different OCW websites beyond the web searching that is currently available.
[58] Data webs: new visions for research data on the Web – a Research Information Network workshop,
2. Complementing the technology landscape
All of the examples mentioned have their own detailed technical perspectives on how to implement the aggregations they require for their purposes. One of the most valuable aspects of the aggregating activities described in this report is the added value they provide in moving work from individual institutions and organisations up to the network level. They remove the need to manage, at the individual repository level, services that can be better provided as part of a collaborative aggregation. The aggregations themselves can then provide services that no individual repository would be capable of providing on its own. Those Research Councils with such facilities recognised this when setting up their data centres (AHDS, ESDS, etc.). The technical model proposed in this report recommends that this successful approach be extended where possible.
In doing so it is necessary to remain flexible, to ensure that the aggregations generated do not become millstones but are able to adapt to circumstances. Moving towards a service-oriented approach, as recommended by the e-Framework initiative, allows each of the components involved in supporting aggregations (the repositories, the aggregators and the end-user services) to be flexibly interchanged as required. Although this is a long-term goal, it is worth pursuing to ensure that repository content is utilised to the fullest extent possible.
A service-oriented architecture (SOA) centres on communication between different components through machine-to-machine interfaces. One of the strongest responses to come out of the interviews for this study was the need for greater communication between different components, though at the human rather than the machine level. Improving and standardising how we, as people, describe the interactions we would like to establish between different components will help to define the potential machine interfaces that will allow an SOA environment to communicate for us. The e-Framework initiative to establish common ways of communicating, through reference models and related activities, is a valuable step along this road. Improved communication between repositories and aggregators/end-user services will facilitate this in the repository and open access arenas.
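The idea of interchangeable components can be sketched in a few lines, on the assumption that each component is known to the others only through an agreed interface. All of the class and method names below are hypothetical illustrations; they are not drawn from the e-Framework or from any existing repository software.

```python
from typing import Iterable, Protocol


class Repository(Protocol):
    """Agreed interface: anything that can expose metadata records."""

    def records(self) -> Iterable[dict]: ...


class InMemoryRepository:
    """A stand-in repository; any class with a records() method would do."""

    def __init__(self, items):
        self._items = list(items)

    def records(self):
        return iter(self._items)


class Aggregator:
    """Collects records from whichever repositories it is given."""

    def __init__(self, repositories):
        self.repositories = list(repositories)

    def harvest(self):
        merged = {}
        for repo in self.repositories:
            for record in repo.records():
                # Keyed by id, so a later copy of a record replaces an earlier one.
                merged[record["id"]] = record
        return list(merged.values())


class SearchService:
    """An end-user service that depends only on the aggregator's output."""

    def __init__(self, aggregator):
        self.aggregator = aggregator

    def search(self, term):
        term = term.lower()
        return [
            record["id"]
            for record in self.aggregator.harvest()
            if term in record["title"].lower()
        ]


repo_a = InMemoryRepository([{"id": "a1", "title": "Soil survey data"}])
repo_b = InMemoryRepository(
    [{"id": "b1", "title": "Census data"}, {"id": "b2", "title": "Lecture notes"}]
)
service = SearchService(Aggregator([repo_a, repo_b]))
matches = service.search("data")
```

Because the search service never refers to a particular repository, a repository can be added, replaced or withdrawn without any change to the end-user service, which is the kind of flexibility the service-oriented approach is intended to offer.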
Many end-user services today have a personal element to them: they seek to address personal needs. All of us maintain, in a more or less organised fashion, a personal collection of digital materials on our computers. We all see the network and the information available through it from our personal viewpoint. The personal aspect of future services will be prominent, and the provision of services across repositories will be no exception. A future challenge will be personal aggregation and how individuals can exploit these aggregations. How can individuals best interact with the different aggregations available to them for personal information management?
[59] E-Books Metadata and Interoperability Testbed, http://www.jisc.ac.uk/index.cfm?name=ebooks_metadata
Some have advocated the use of Semantic Web technologies to facilitate this, and the use of RDF to describe the information available. Much remains to be understood about the potential of the Semantic Web, though initiatives such as the investigation of data webs will hopefully open up development paths. RDF may also provide the freedom of structure that metadata generation may require: will individuals feel better able to provide useful metadata about resources they are creating if they can provide this in their own way for later mark-up using RDF rather than a set metadata form? Social tagging suggests it is a route that can be popular.
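As a rough illustration of later mark-up, the sketch below turns free-form tags supplied by an individual into RDF statements serialised as N-Triples. The choice of the Dublin Core `subject` property, the resource address and the function names are assumptions made for the example; the report itself does not prescribe a vocabulary.

```python
# Dublin Core "subject" property, chosen here purely for illustration.
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"


def escape_literal(text):
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def tags_to_ntriples(resource_uri, tags):
    """Express each non-empty tag as one subject statement about the resource."""
    lines = []
    for tag in tags:
        tag = tag.strip()
        if tag:
            lines.append(
                f'<{resource_uri}> <{DC_SUBJECT}> "{escape_literal(tag)}" .'
            )
    return "\n".join(lines)


# Hypothetical resource and tags, as an individual might supply them.
triples = tags_to_ntriples(
    "http://repo.example.org/item/1", ["soil", " field notes ", ""]
)
```

The individual supplies only the tags; the structure is added afterwards, which is the division of effort the question above is concerned with.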
The use of RDF is not simple, however, and an ongoing source of debate will be the balance between lightweight and more complex solutions for achieving interoperability. Lightweight solutions, using OAI-PMH and RSS for example, can draw people into using interoperable systems, whilst complex solutions such as the DR OSID require greater initial investment for increased potential gain. The former will encourage take-up and need to be used as a lead-in to more detailed and value-added interactions.