sharing of data
11.2 Recommendations
Previous studies suggest that there are multiple obstacles and, hence, no single solution to increase the sharing and archiving of research data. Yet we will present some recommendations here and in figure 11.1.
FIGURE 11.1
Earlier studies, as well as this analysis, suggest that work is needed at several levels: researchers, research institutions, research funders, and government and international bodies.
Initiatives need to take place in parallel. For example, taking action to make more researchers share data without the proper infrastructure will most likely prove counterproductive. Thus, there is a strong need for a coordinated effort.
We see that the Research Council of Norway can play a key role in promoting open access to research data in Norway.
Raising awareness
The sharing and archiving of research data entails many obstacles and questions that need to be answered. Many respondents were undecided or did not wish to participate in the survey. This might suggest that researchers consider sharing and archiving of research data a complex and difficult topic.
We would suggest that the Research Council of Norway actively work to raise awareness of the issue, covering both the benefits and pitfalls of archiving and sharing research data.
In particular, exemplifying potential opportunities and value is important, inter alia, by using best practice cases. Emphasis should be on showing that sharing and archiving is worthwhile for researchers.
In this respect, there also seems to be a need for clarity as to the differences between archiving and open access to research data. The archiving process does not necessarily imply full open access to research data for all - it should be considered a premise for sharing data. Even if the two matters differ, they are closely linked, and should be seen in relation to one another.
Most researchers use other researchers' data. Most researchers are also willing to let others re-use their generated data if certain conditions and restrictions are fulfilled.
The researchers are the ones who gather and analyse the data, and who will archive and share the data in the end. Researchers want to know what happens to their research data. As such, it is important to raise awareness among researchers.
However, the study also indicates that researchers need support, and many do not see this support from their management. Thus, it is also important to raise awareness at the institutional level.
Giving credit as well as responsibility to researchers
The study indicates that a lack of incentives and credit for gathering data is a barrier to increased sharing of research data.
These findings correspond to findings in earlier national and international studies (e.g., Kvale, 2012).
The respondents would be more willing to share data if they received credit for their data generation work. One obvious way of crediting researchers would be by supporting the implementation of a citation or reference system for data. Credit is an important motivation for researchers.
“References can be seen as a kind of normative payment”
Ingwersen (2011) Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure
There is no well-established citation system for research data in Norway, which gives researchers few incentives to prioritize time for preparing data for sharing. The lack of a well-established citation system is also an international issue. We thus see a benefit in coordinating such systems at the international level.
Ideally, the system should be easy to use and work alongside existing systems for publishing.
Tenopir (2011) suggests promoting good sharing practices among researchers. For example, receiving copies of articles that use a researcher's data is one condition that would encourage sharing and promote best practice.
The Council could also introduce some kind of requirements for researchers. Lord et al. (2006) studied large-scale data sharing in the life sciences based on ten case studies, and found that a laissez-faire approach to the collection and distribution of data results in waste, as such data will not contain sufficient information to enable re-use.
A key recommendation from Lord et al. (2006) is an insistence on a data management plan that clearly defines responsibilities and goals and awareness of the needs and practices of data management.
The Research Council could introduce a requirement for data management plans as part of the traditional application procedure.
It is also possible to make sharing of research data a part of the financial system for basic funding. Further, there could be a system in which the Research Council of Norway withholds funding until data is properly shared and archived.
We do not recommend implementing such stringent measures at the current stage, as it would require considerable work in terms of its design and in terms of having the proper infrastructure in place. Without proper guidelines and a sound infrastructure, such a system could be counterproductive.
In recent years, research communities have been left to establish methods and practices for sharing and archiving their research data. We are concerned that this leads to a suboptimal organization of solutions. As stated earlier, we do not find large differences across sectors or research disciplines. Hence, we cannot support arguments leading to the design of tailored solutions for each specific sector or individual research discipline. Yet the work must still be inclusive of all research communities, as they have the knowledge and will have to implement the proposed strategies and solutions.
Guidelines, rules and best practice
Our study suggests that many researchers lack knowledge as to what data to share and archive. In addition, researchers lack knowledge as to what form the data should have, and how proper information about the data should be assigned.
Thus, the study suggests a need for better guidelines, standards and education relating to sharing and archiving research data. Such guidelines and standards should be developed in close interaction with researchers, institutions and legal experts. We recommend that implementation of guidelines and standards should be inspired by work initiated internationally to avoid creating a Norwegian bureaucracy alongside international standards.
One way of promoting the use of shared data would be by creating a solid and informative platform for metadata. A metadata platform can be a low-key activity, as it can be seen as a first step towards more complex infrastructure solutions. In addition, we perceive that many researchers are not aware of the possibilities of accessing data gathered by other researchers. Better metadata can overcome this issue.
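To illustrate what such a metadata platform might hold, the following is a minimal sketch in Python. The field names, identifiers and the `find_datasets` helper are hypothetical and chosen for illustration only; they do not describe any existing Norwegian or international system. The sketch shows how a small set of descriptive fields could let researchers discover datasets gathered by others, including datasets that are archived but not openly accessible.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Descriptive metadata for one archived dataset (hypothetical fields)."""
    identifier: str                 # placeholder for an identifier assigned by an archive
    title: str
    creators: list[str]
    discipline: str
    keywords: list[str] = field(default_factory=list)
    access: str = "restricted"      # archiving does not by itself imply open access


def find_datasets(catalogue, term):
    """Return records whose title or keywords contain the search term."""
    needle = term.lower()
    return [
        record for record in catalogue
        if needle in record.title.lower()
        or any(needle in keyword.lower() for keyword in record.keywords)
    ]


# Two invented example records, for illustration only.
catalogue = [
    DatasetRecord("example-0001", "Interview transcripts on farm succession",
                  ["A. Researcher"], "social science",
                  ["interviews", "agriculture"]),
    DatasetRecord("example-0002", "Soil moisture measurements 2010-2012",
                  ["B. Researcher"], "earth science",
                  ["soil", "agriculture"], access="open"),
]

matches = find_datasets(catalogue, "agriculture")
print([record.identifier for record in matches])
```

Even a simple catalogue of this kind makes a restricted dataset visible to other researchers, who can then ask for access under the owner's conditions.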
We would also suggest starting to work on data selection (i.e., on defining which data are worthwhile and which are not). Even though our study does not suggest any major differences in the practices and barriers across research disciplines and sectors, the open answers indicated a strong need for better understanding and guidance as to which data to archive and share, and in which form to do so. In particular, researchers who mainly use textual data (e.g., interviews) have difficulties deciding which data to share and preserve.
Infrastructure and funding
Interviews and studies both suggest that the infrastructure for the sharing and archiving of data is fragmented, overlapping and inadequate. Many are satisfied with the current archiving solutions, yet researchers seem to archive most of their data on their own institutional servers or local storage devices. We found no differences across sectors or research disciplines on the topic of storage.
Given the large share of data stored locally, there is clearly a need for better infrastructure solutions. Better infrastructure could increase the motivation for archiving data at data archiving centres, which could provide more secure means for archiving data and make data easier to restore.
Finally, it would lay the ground for increased sharing of research data. Debate on infrastructure investment should involve all relevant stakeholders while ensuring a robust infrastructure that in turn will serve the needs of the future. We are somewhat cautious as to the design and scale of such a system because it could be a matter of cost and benefit. We thus see that more information on ambitions is needed.
An ideal data infrastructure for scientific research would have a long list of technical characteristics. We refer to the wish list included in the EC white paper on scientific data, “Riding the Wave”.
TEXTBOX 11.1
A WISH LIST FOR E-INFRASTRUCTURE
Open deposit, allowing user-community centres to store data easily
Bit-stream preservation, ensuring that data authenticity will be guaranteed for a specified number of years
Format and content migration, executing CPU-intensive transformations on large data sets at the command of the communities
Persistent identification, allowing data centres to register a huge amount of markers to track the origins and characteristics of the information
Metadata support to allow effective management, use and understanding
Maintaining proper access rights as the basis of all trust
A variety of access and curation services that will vary between scientific disciplines and over time
Execution services that allow a large group of researchers to operate on the stored data
High reliability, so researchers can count on its availability
Regular quality assessment to ensure adherence to all agreements
Distributed and collaborative authentication, authorisation and accounting
A high degree of interoperability at format and semantic level
Adapted from the PARADE (Partnership for Accessing Data in Europe) White Paper (2009)19
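As an illustration of the bit-stream preservation item in the wish list, the sketch below shows one common way an archive could check that deposited data has not changed: recording a checksum at deposit and recomputing it later. This is an assumption about how such a guarantee might be implemented, not a description of the approach in the white paper; the function names and the sample data are hypothetical. Only Python's standard `hashlib` module is used.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current checksum with the one recorded at deposit."""
    return sha256_digest(data) == recorded_digest


# Invented sample content standing in for a deposited data file.
deposited = b"station,year,value\nA,2012,4.2\n"
recorded = sha256_digest(deposited)   # stored alongside the data at deposit

print(is_unchanged(deposited, recorded))          # True: the bytes are identical
print(is_unchanged(deposited + b" ", recorded))   # False: one byte was added
```

A check of this kind only detects change; restoring the data would still depend on the archive keeping intact copies.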
19 Partnership for Accessing Data in Europe (PARADE) is a consortium aiming to build efficient services addressing the data management needs of multiple research communities. Strategy for a European Data Infrastructure (White Paper) was published in October 2009.
Ball, A. (2012). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre.
Borgman, C. L. (2012). The conundrum of sharing research data. Journal of the American Society for Information Science and Technology, 64 (6): 1059-1078.
Committee on Scientific Accomplishments of Earth Observations from Space, National Research Council (2008). Earth Observations from Space: The First 50 Years of Scientific Achievements. The National Academies Press. p. 6. ISBN 0-309-11095-5. Retrieved 2010-11-24.
Creswell, J. W. (2008). Educational Research: Planning, conducting, and evaluating quantitative and qualitative research (3rd ed.). Upper Saddle River: Pearson.
EC (2012). Online survey on scientific information in the digital age, http://ec.europa.eu/research/science-society/document_library/pdf_06/survey-on-scientific-information-digital-age_en.pdf
Campbell, E. G. et al. (2002). Data withholding in academic genetics: evidence from a national survey, Journal of the American Medical Association 287, no. 4 (2002): 473–480.
E-science (2005). Large-scale data sharing in the life sciences: Data standards, incentives, barriers and funding models (The “Joint Data Standards Study”), http://www.nesc.ac.uk/technical_papers/UKeS-2006-02.pdf
EU (2010). Riding the wave. How Europe can gain from the rising tide of scientific data, http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
Berman, F. & Cerf, V. (2013). Who Will Pay for Public Access to Research Data? http://www.greatplains.net/download/attachments/8486930/SCIENCE2013AUGPAYINGFOROPENACCESS.pdf
Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020. Version 1.0, 11 December 2013. http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
Hanson, Sugden & Alberts (2012). Making data maximally available, Science 331, no. (11 February 2011).
Hey et al. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery.
Ingwersen, P. (2011). Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure.
JISC. Research 3.0: driving the knowledge economy, http://www.jisc.ac.uk/whatwedo/campaigns/res3.aspx
Kowalczyk, S. & Shankar, K. (2011). Data sharing in the sciences. Ann. Rev. Info. Sci. Tech., 45: 247–294.
Kvale, L. (2012). Data Sharing in the Life Sciences - A Study of Researchers at The Norwegian University of Life Sciences (Master's thesis). https://oda.hio.no/jspui/bitstream/10642/1269/2/Kvale_Live_Handlykken.pdf