3. ROLES AND RESPONSIBILITIES
3.3 Organisational issues in providing repository services
3.4.3 Output-level services
3.4.3.1 Discovery services
Google and other Google-like services (existing and future) need to be factored into a UK national schema. Existing services recognise this and are working to achieve it. The RDN is exploring this possibility, for example, and the Australian national service ADS (ARROW Discovery Service) is talking to Google about its coverage of the ARROW database. As a complication, some discovery services – OAIster is an example – will not accept duplicate entries (i.e. problems arise if a document or its metadata exists in more than one location). National services should broker arrangements between such services and the repositories to save the individual repositories doing so themselves. This brokerage function resides in the ‘technical advisory services’ location in the overall schema.
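The duplicate-entry problem described above can be illustrated with a minimal sketch of how a brokering service might collapse copies of the same item before passing records on to a discovery service. This is an assumption-laden illustration, not a description of how OAIster or any national service actually works: the record fields (`repository`, `identifier`, `title`) and the matching rule (identifier if present, otherwise a normalised title) are hypothetical.

```python
import re
from typing import Dict, Iterable, List


def record_key(record: dict) -> str:
    """Build a comparison key: the identifier if present, else the normalised title."""
    identifier = (record.get("identifier") or "").strip().lower()
    if identifier:
        return "id:" + identifier
    # Lower-case the title and collapse punctuation/whitespace so that
    # trivially different copies compare as equal.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return "title:" + title


def deduplicate(records: Iterable[dict]) -> List[dict]:
    """Keep the first record seen for each key; later copies are dropped."""
    seen: Dict[str, dict] = {}
    for record in records:
        seen.setdefault(record_key(record), record)
    return list(seen.values())


# Hypothetical harvested records: the first two describe the same item
# held in two different repositories.
harvested = [
    {"repository": "repo-a", "identifier": "doi:10.1000/xyz", "title": "A Study"},
    {"repository": "repo-b", "identifier": "DOI:10.1000/XYZ", "title": "A study."},
    {"repository": "repo-c", "identifier": "", "title": "Another Paper"},
]
unique = deduplicate(harvested)
```

In practice the choice of which copy to keep (here simply the first one harvested) would be a policy decision for the brokering service and the repositories concerned.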
ePrints UK looked at various ways of using the full text to inform metadata creation, with some success, but in some cases had difficulty accessing the full text of documents in order to run searches across it. The same problem will arise where the original object has been deposited in a trusted repository (e.g. PubMed Central) that falls outside the UK network.
Subject descriptors were the focus of the HILT projects. HILT I showed that there was consensus in the community that a service that mapped between schemes was preferable to the adoption of a single scheme (Nicholson et al, 2001). The HILT II project has followed this up by developing a set of pilot terminologies as a service for the JISC Information Environment (Nicholson et al, 2005). These have the potential to improve interoperability, not only in the UK but globally, since the mapping approach enables functioning across multiple languages. Subject descriptors, ontologies, classifications, thesauri and related systems have increasing importance in semantic web applications.
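The mapping approach can be sketched in a few lines. The example below is an illustration only and does not reflect the HILT pilot service itself: the scheme names (`scheme_a`, `scheme_b`), the terms, and the `expand_query` helper are all hypothetical. It shows the general idea that a search term in one scheme is expanded to its mapped equivalents in other schemes, so that neither community has to abandon its own vocabulary.

```python
from typing import Dict, List, Tuple

# Hypothetical crosswalk: (scheme, term) -> equivalent terms in other schemes.
MAPPINGS: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
    ("scheme_a", "Oncology"): [("scheme_b", "Cancer research")],
    ("scheme_b", "Cancer research"): [("scheme_a", "Oncology")],
}


def expand_query(scheme: str, term: str) -> List[Tuple[str, str]]:
    """Return the original term plus any mapped equivalents in other schemes."""
    return [(scheme, term)] + MAPPINGS.get((scheme, term), [])


# A user searching with a scheme_a term also reaches material indexed under scheme_b.
terms = expand_query("scheme_a", "Oncology")
```

A term with no mapping is simply passed through unchanged, so the search degrades to a single-scheme search rather than failing.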
The RDN and Higher Education Academy have developed a record interchange format that seems to work well (RLLOMAP). RLLOMAP is likely to be re-merged with its source, UK LOM Core, in time.
Interdisciplinary research in both arts/humanities and the natural sciences places new demands upon discovery services: if these services operate only within distinct and separate subject areas, interdisciplinary material remains invisible. This is an important issue because interdisciplinary and multidisciplinary research is on the increase and is likely to form a major part of research activity in the future.
3.4.3.2 Personalisation and authentication
Authorisation services will be required for access to and licensing of certain types of data, such as data that require users to commit to agreements about usage and disposal (e.g. some of UKDA’s data holdings), data that can only be used by certain known parties, and data for which the holder does not own the copyright (such as many of the geospatial datasets held in the UK that incorporate Ordnance Survey data).
Personalisation in the form of email alerts has been found to work successfully (i.e. has gained user approval), even though users generally express a preference for web page-based alerts. The ARROW Discovery Service has developed a daily email alerting system for individuals registered with the service, and sees a spike in usage each day at around 10am, just after the alerts go out [20].
CSA (formerly Cambridge Scientific Abstracts), a commercial abstracting and indexing service, has expressed some interest in using its Community of Scholars database [21] as a name authority system for repositories (MacLeod, personal communication). This represents one possible solution to the problem of name authority control, though only a partial one since CSA’s database will not have complete coverage of the UK author base. This may be yet another area where learned societies have a role, providing name authority services from their own member databases or digital libraries.
[20] Debbie Campbell, ARROW; personal communication
[21] http://www.csa.com/e_products/COScholars.php
3.4.3.3 Publishers
Publishers are already beginning to work with repositories to provide services. The American Physical Society is offering XML-generating services (Kelly, personal communication). The European Physical Society and the Institute of Physics Publishing are encouraging authors to deposit their articles in arXiv and notify the publisher when this has been done so that the publisher can harvest them from the repository for peer review. Yet other publishers have begun using repository content to develop overlay journals, selecting out articles that fit a profile and bundling them for a specific readership.
3.4.3.4 Other issues: Research data
Several significant moves have been made on research data recently. These undoubtedly represent only the vanguard of a drive to require that research data be made accessible to the community for examination, manipulation, mining etc. In other words, data deposition will increase and repositories will need to plan for this:
• The OECD Committee on Science & Technology has recently developed a Declaration on Access to Research Data from Public Funding, which is now being taken forward by an expert group [22]. This will finalise a draft text in October 2006 which will be taken to the OECD Council towards the end of 2006
• The journal Nature and the International Committee of Medical Journal Editors have also issued guidelines to their authors about making supporting data freely accessible in public repositories when articles are published in the journals concerned
• In addition, NIH, NASA, the US Global Change Research programme, the Wellcome Trust, and some of the UK Research Councils have all announced requirements about making data accessible as conditions of grants