Library & Academic Computing Committee 18 October 2012 PAPER 05. College of Humanities and Social Sciences. Research data storage and management

College of Humanities and Social Sciences

E:\today\LACC\RDS-M (2).docx
Created by Fraser Muir
Created on 16/10/2012 13:40:00
Last saved by MURDOCH Cameron

Amendment & Authorisation History

Ver Date Changes Name Author

Introduction

At international, national and local levels, there is intense interest in how to manage the rapidly expanding volume and complexity of research data. The concern is both for the shorter term (ensuring competitive advantage through secure and easy-to-use access) and for the longer term (ensuring enduring access and usability for the research community into the future, and compliance with legislation). The UK government and research funding bodies are debating with the HE community how best to address this large and complex problem, and have funded various initiatives to explore options (e.g. the United Kingdom Research Data Service, Digital Curation Centre, UK Data Archive).

Most Research Councils now mandate or encourage Data Management Policies and open deposit of data [1].

What is clear is that there will be no external solution that will remove from each university the requirement to provide storage and management procedures for the data of its own research activities.

Background

At the University of Edinburgh, a consultation on the computing requirements of the research community was conducted first. Key findings of this consultation indicated a need for larger storage space on servers; more robust archiving services; a simple, secure and preferably automatic data back-up service; and a high demand for training and awareness-raising across the University.

A pilot implementation of the JISC Data Audit Framework project was carried out in 2009. The study focused primarily on research data management rather than storage requirements. The findings at Edinburgh were that there was inadequate storage space and lack of clarity about roles and responsibility for research data management by University research staff. The project noted a need for storage and backup procedures including provision for business continuity arrangements. A formal procedure was needed for data transfer when staff and students leave the institution.

Solutions for the University of Edinburgh will only be successful if they come from a partnership of individual researchers, Schools, Colleges and Information Services. Each has expertise and resources that can be brought to bear to the benefit of all. Recognising this, a cross-institution steering group has been formed with representatives from each College, ERI, ISG and EDINA, led by Peter Clarke from the School of Physics and Astronomy. The College CIO represents CHSS on this group.

The resulting review of both research data storage (RDS) and research data management (RDM) concluded that, to inform the delivery of centrally managed large-scale research data services, an initial small number of research groups should access either pilot or planned services over summer 2012. The requirements for research data services were gathered via the two linked working parties (RDS and RDM), which consulted on data storage and data management. The purpose of these pilots is to confirm service direction and gain initial feedback before moving to wider consultation, rather than to restate requirements for research data storage or management.

The proposed pilots will address the stated research storage recommendations for a globally accessible cross-platform file store and for accessibility of research data to all virtual collaborators, facilitating extra-institutional collaboration via filestore and private cloud pilots. Further consultation through the College Research Committees will take place following the pilots.

The planned implementation of the proposed research data services is to build a large common storage infrastructure using standards-compliant technologies, and then to layer data services on top of this common infrastructure.

This College has also been piloting the use of open-access data archive solutions provided through the Edinburgh Datashare facility.

Project strands

The pilot project has a number of strands which are detailed below. Most pilot groups are undertaking an evaluation of a single strand, but several are using two or all of the different components, thus also evaluating the ease of moving data between the services.

Storage

• Capacious, fast, direct access from all desktops: Windows, Macintosh and Linux
• Group/project-based, even if that group or project is only one person
• Accessible remotely without the need to first connect via VPN

Private cloud facility [2]

• Sharing outwith the University via an easy-to-use mechanism
• Read-only and read-write sharing
• File synchronisation between devices, including mobile (phone, tablet etc.)
• Onsite, to satisfy data security concerns

Data vault

This service is not yet provisioned, so the exact requirements are still to be determined. However, it is expected that it would feature:

• Long term storage
• No curation: data is left as-is in its original format
• Dark archive, not externally available
• Snapshots of data at important stages of the data lifecycle, e.g. the dataset at original capture before manipulation


Exact requirements need further clarification and suggestions are welcome.

Data archive [3]

• Open access, but with the ability to embargo the dataset, perhaps permanently
• Metadata capture and searching
• Long term curation by colleagues with expertise in data format trends
• Satisfies funders' requirements for data deposit, even if national data services reject the dataset

Data management

• Advocacy and training on how to manage data effectively, efficiently and securely
• Workflows to move data from one pot to another depending on the requirement, particularly from storage services to archive
• Data management plans to satisfy funders, and testing of the DMPOnline tool to facilitate creation of compliant plans [4]
• Data security and information governance considerations for datasets that are considered particularly sensitive, e.g. patient data
• Application of the data lifecycle model [5]

CHSS pilot participants

A number of pilot groups and projects have been identified within the College and researchers have been providing input to various strands over the summer period. These will continue throughout the project.

• PPLS; Dinka songs archive and dissemination of AHRC project to capture songs and transcriptions from South Sudan
• PPLS; BabyTalk project to capture and analyse large quantities of audio data
• Law and SPS; AQMeN project requiring sharing of secure datasets across three Universities
• HCA; large data storage and sharing of forensic archaeology 3D image data
• Divinity; facilitation of research administration
• Business School; general testing of private cloud mechanism
• ECA; to collate rich-media research outputs and provide mechanism for sharing and dissemination

The pilot groups were chosen less for the size of dataset (colleagues in Physics for example will be able to test this better), more for their complexity and variety of file formats and requirements to curate. Whilst not exclusive to the humanities and social sciences, our requirements are often quite different to those found elsewhere in the University.

[3] http://datashare.is.ed.ac.uk
[4] https://dmponline.dcc.ac.uk/users/login
[5] http://www.dcc.ac.uk/resources/curation-lifecycle-model

Recommendations

Members are asked to disseminate information about this project within their Schools and follow up with the College CIO as necessary.

In addition, feedback on the services being proposed within each of the project strands would be gratefully received, to better steer the project towards meeting our requirements.

Finally, the College CIO would welcome the opportunity to attend any local School research committees to further discuss the project and any local requirements.

Fraser Muir

CIO, College of Humanities and Social Sciences

21 September 2012
