
Academic year: 2020


Contribution to the Open Data Strategy in the Wheat Initiative

Esther Dzalé Yeumo Kaboré


Context

Coordinate worldwide research efforts in the fields of wheat genetics, genomics, physiology, breeding and agronomy. (Sept. 2011)

Share relevant agricultural data available from G-8 countries with African partners and … develop options for the establishment of a global platform to make reliable agricultural and related information available to African partners. (April 2013)

The Global Open Data for Agriculture and Nutrition (GODAN) initiative seeks to support global efforts to make agricultural and nutritionally relevant data available, accessible, and usable for unrestricted use worldwide. (Oct. 2013)


Some societal challenges:

• Feed the world

• Climate change

• Sustainable agriculture

• Health and nutrition

These imply dealing with data:

• Data-driven science

• Big data

• Data management, sharing, and re-use

From different points of view:

• Politics

• Technology (IT)

• Scientific disciplines

• Intellectual property, ethics

• Economics


Status

• Recognized and endorsed by the Research Data Alliance (RDA)

• Part of the Wheat Initiative Information System project

Focus:

• The WG aims to provide a common framework for describing, representing, linking and publishing Wheat data with respect to open standards.

• The WG will focus first on the following data types: SNPs, genomic annotations, phenotypes, genetic maps, physical maps, germplasm, expression data.

Issues

• Build on existing work

• Keep coherence with existing projects and initiatives as much as possible

• Adoption


5 stars Linked Open Research Data

* Publish your data on the Web at a stable URI (in any structured format) under an open license

** Use non-proprietary formats (e.g., CSV instead of Excel)

*** Document your data: provide human-readable documentation (the research context, data collection methods, data preparation, etc.) and basic metadata (creator, publisher, date of creation, last modification, version number, etc.)

**** When using an in-house vocabulary, make it available via a stable URI, both as a formal file and as human-readable documentation, using content negotiation

***** Link to others by re-using existing vocabularies to name things and their relationships rather than re-inventing them. Link out explicitly to external data sources.

Make it easy to find and access

Make it easy to re-use

Put it in context

Make it easy to understand
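
The fourth star asks for an in-house vocabulary to be served from one stable URI both as a formal file and as human-readable documentation, using content negotiation. The selection step might be sketched as follows. The function names, the two file names and the fallback to HTML are illustrative assumptions, not something the slides prescribe; a production server could instead answer 406 when nothing matches.

```python
# Hypothetical representations of one vocabulary behind a single stable URI.
AVAILABLE = {"text/html": "vocab.html", "text/turtle": "vocab.ttl"}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda pair: -pair[1])


def negotiate(header, available=AVAILABLE, default="text/html"):
    """Pick the media type to serve for a given Accept header."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return default  # assumption: fall back to the human-readable page
```

A browser sending `text/html,*/*;q=0.8` would get the documentation page, while a Linked Data client sending `text/turtle` would get the formal file from the same URI.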


Our work plan

What | Who

A survey to inventory the assets | The Wheat community

Analyse the results and discuss their consequences | The work group + Wheat experts

Produce a report | The work group (+ adoption groups for the review)

Write a cookbook to provide the Wheat data managers with guidelines | The work group (+ adoption groups for the review)

Identify data interoperability use cases | The work group + adoption groups

A library of linked vocabularies and ontologies | The work group (+ adoption groups for the review)


Data formats

Inventory

• SNPs

• Genomic annotations

• Phenotypes

• Genetic maps

• Physical maps


Vocabularies and ontologies

• Gene Ontology

• Sequence Ontology

• Plant Ontology/Anatomy

• Plant Ontology/Development stage

• Trait Ontology

• Crop Ontology (Wheat Trait Ontology)

• Project-specific trait ontologies (Drops, agronomics)

• Others?
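
Re-using terms from these ontologies usually means referring to them by URI rather than by a local label. As a minimal sketch, the function below expands compact identifiers such as `GO:0008150` into OBO Foundry style PURLs. The prefix set and the function name are assumptions made for illustration, and the Crop Ontology is left out because its identifiers follow a different scheme.

```python
import re

# OBO Foundry PURL pattern; the prefixes listed are an illustrative subset.
OBO_BASE = "http://purl.obolibrary.org/obo/"
OBO_PREFIXES = {"GO", "SO", "PO", "TO"}

_CURIE = re.compile(r"^([A-Za-z]+):(\d+)$")


def expand_curie(curie):
    """Turn 'GO:0008150' into 'http://purl.obolibrary.org/obo/GO_0008150'."""
    match = _CURIE.match(curie.strip())
    if not match:
        raise ValueError(f"not a PREFIX:digits identifier: {curie!r}")
    prefix, local_id = match.group(1).upper(), match.group(2)
    if prefix not in OBO_PREFIXES:
        raise ValueError(f"unknown ontology prefix: {prefix}")
    return f"{OBO_BASE}{prefix}_{local_id}"


print(expand_curie("GO:0008150"))  # GO term for biological_process
print(expand_curie("SO:0000704"))  # SO term for gene
```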


Metadata standards

• Darwin Core

• Dublin Core

• MAGE

• MINSEQE (Minimum Information about a high-throughput SeQuencing Experiment)

• Others?
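
To make the Dublin Core entry concrete, here is a minimal sketch that serializes a basic dataset description with elements from the Dublin Core element set. The wrapper element name `metadata`, the function name and every field value are placeholders invented for the example, not data from the survey.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names to an XML string."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
record = dublin_core_record({
    "title": "Example wheat phenotype dataset",
    "creator": "Example Lab",
    "publisher": "Example Institute",
    "date": "2014-01-01",
    "format": "text/csv",
    "rights": "https://creativecommons.org/licenses/by/4.0/",
})
print(record)
```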

Practices

• Data storage?

• Data policy?

• Guidelines for data management?


Report of the survey

A cookbook

• What kinds of entities and relationships are involved in describing and accessing Wheat data => ontologies

• What properties should be considered for publishing meaningful/useful LOD-ready Wheat data

• Which controlled vocabulary terms are appropriate for any given property when producing LOD-ready Wheat data

A library of linked vocabularies and ontologies

A prototype that demonstrates the gain in interoperability
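
The cookbook's question about which controlled vocabulary terms fit a given property can be turned into a simple check. The sketch below assumes a hypothetical `data_type` property whose allowed terms are the data types the working group plans to address first; the property name, the exact term spellings and the function name are illustrative assumptions.

```python
# Hypothetical controlled vocabulary, keyed by property name.
ALLOWED_TERMS = {
    "data_type": {
        "SNP", "genomic annotation", "phenotype", "genetic map",
        "physical map", "germplasm", "expression data",
    },
}


def invalid_values(records, allowed=ALLOWED_TERMS):
    """Return (record index, property, value) for values outside the vocabulary."""
    problems = []
    for index, record in enumerate(records):
        for prop, terms in allowed.items():
            value = record.get(prop)  # None when the property is missing
            if value not in terms:
                problems.append((index, prop, value))
    return problems


records = [
    {"id": "d1", "data_type": "SNP"},
    {"id": "d2", "data_type": "snps"},  # not a vocabulary term
    {"id": "d3"},                       # property missing
]
print(invalid_values(records))
```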


COEUS: rapid build of knowledge bases

Configuration

• What data?

• Where does the data come from?

• How are the data sources connected?

• How will the data be integrated?

• To which ontologies/models will the newly imported data be matched?

Data import using flexible connectors for CSV, XML, SQL, SPARQL, RDF, etc.

Abstraction to a semantic environment using advanced data/ontology selectors
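
The general idea behind a CSV connector that feeds a semantic store can be sketched independently of any tool. The code below is not the COEUS API: it is a hand-written illustration that turns CSV rows into N-Triples statements. The `http://example.org/wheat/` namespace, the column names and the function names are hypothetical.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/wheat/"  # hypothetical namespace


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def csv_to_ntriples(csv_text, id_column, base=BASE):
    """Emit one triple per non-empty cell; id_column names the subject."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base}{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier itself and empty cells
            predicate = f"<{base}property/{quote(column, safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines)


sample = "accession,trait,value\nW001,plant_height,92\n"
print(csv_to_ntriples(sample, "accession"))
```

In a real integration the predicates would come from the selected ontologies rather than from column names, which is the matching step the configuration questions above refer to.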


Karma: a data integration tool

• Developed by the Information Sciences Institute, University of Southern California

• Quick and easy data integration from a variety of data sources including databases, spreadsheets, delimited text files, XML, JSON, KML and Web APIs.

• Karma learns to recognize the mapping of data to ontology classes.

• Many demos and use cases are available on the Karma website: http://www.isi.edu/integration/karma/


Be an adoption group

• Provide data interoperability use cases

• Review the outputs and give feedback

• Use the cookbook

Initial Adopters of Working Group Deliverables:

• The International Wheat Initiative

• The French National Institute for Agricultural Research (INRA)

• The Food and Agriculture Organization of the United Nations (FAO)

• The International Maize and Wheat Improvement Center (CIMMYT)

• The International Wheat Genome Sequencing Consortium (IWGSC)

• The Plant Ontology project
