
PIT adoption for Climate Data Management


Academic year: 2021


(1)

PIT adoption for Climate Data Management

The German Climate Computing Center (DKRZ)

5th RDA Plenary

(2)

Overview

DKRZ: A climate science service provider

Data services for the international climate science community

RDA PIT adoption plans

Generic underpinning for future data services:

(3)

DKRZ

A service provider for the German climate (modeling) community

• Non profit company (GmbH) established 1987

• Located in Hamburg, Germany

Balanced HPC / storage system:

• 3 PFlop Bull system

• 45 PByte Lustre parallel file system

• 335 PByte HPSS tape backend

Data Services:

• Long term data archival

• World Data Center for Climate

• Core node in international climate data

(4)

Climate data services

[Figure: data life cycle at DKRZ. Stages: data generation / ingest (DKRZ HPC environment), data dissemination / evaluation (research data environment), long-term archiving (WDCC long term archive environment). Context labels: national climate modeling community; international climate community (modeling + impact + ..); ESGF data federation; interdisciplinary use cases (cross community context).]

(5)

Climate data services

[Figure: the data life cycle diagram from the previous slide, now including the World Data Center Climate and numbered markers for the three PIT adoption areas:]

1 PID assignment / management

2 Replication and PIT

3 Balanced data access and PIT

(6)

DKRZ RDA adoption plans

Early PID assignment in data life cycle:

→ PID registration as part of data center ingest process

First steps:

DKRZ is partner in the European Persistent Identifier Consortium (EPIC)

→ Operates PID (Handle) server

PID registration for (existing) LTA data products

ESGF PID integration: see the "large scale data projects meet RDA" session

[Figure: data life cycle diagram (data generation / ingest, dissemination / evaluation, long-term archiving) with marker 1 placed at four points.]

(7)

PIT adoption: core services

Collection / hierarchy discovery:

Climate data sets are organized in collections (e.g. time series, data + metadata) and hierarchies (e.g. according to experiment organisation)

Collections are built through PID information types (agreement on specific PID metadata elements needed for collection/hierarchy management)

Generic collection / hierarchy discovery service to be used in many use cases (e.g. replication, balancing, ..)

A) Check service applicability: PIT profile conformance test

B) Get components of collection / hierarchy, get values of properties with specified types
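Steps A and B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory REGISTRY stands in for a PID (Handle) server, and the record keys ("profile", "members", "url", "checksum"), the REQUIRED profile table and the "hdl:demo/..." identifiers are hypothetical, not agreed PID information types.

```python
from typing import Dict, List

# Hypothetical stand-in for a PID resolver: PID -> typed key/value record.
REGISTRY: Dict[str, dict] = {
    "hdl:demo/exp": {"profile": "collection",
                     "members": ["hdl:demo/ts1", "hdl:demo/ts2"]},
    "hdl:demo/ts1": {"profile": "collection",
                     "members": ["hdl:demo/a", "hdl:demo/b"]},
    "hdl:demo/ts2": {"profile": "collection", "members": ["hdl:demo/c"]},
    "hdl:demo/a": {"profile": "file", "url": "https://example.org/a.nc", "checksum": "aa"},
    "hdl:demo/b": {"profile": "file", "url": "https://example.org/b.nc", "checksum": "bb"},
    "hdl:demo/c": {"profile": "file", "url": "https://example.org/c.nc", "checksum": "cc"},
}

# Assumed profiles: the typed entries a record must carry to be usable.
REQUIRED = {"collection": {"members"}, "file": {"url", "checksum"}}


def conforms(registry: Dict[str, dict], pid: str) -> bool:
    """Step A: profile conformance test for a single PID record."""
    record = registry.get(pid)
    if record is None:
        return False
    required = REQUIRED.get(record.get("profile"))
    return required is not None and required.issubset(record)


def get_components(registry: Dict[str, dict], pid: str) -> List[str]:
    """Step B: walk a collection / hierarchy and return its leaf PIDs.

    Assumes the hierarchy is acyclic; no cycle detection is done here.
    """
    if not conforms(registry, pid):
        raise ValueError(f"{pid} does not conform to a known profile")
    record = registry[pid]
    if record["profile"] != "collection":
        return [pid]
    leaves: List[str] = []
    for member in record["members"]:
        leaves.extend(get_components(registry, member))
    return leaves


def get_values(registry: Dict[str, dict], pids: List[str], key: str) -> List[str]:
    """Step B: values of one typed property for each PID."""
    return [registry[pid][key] for pid in pids]
```

Calling get_components(REGISTRY, "hdl:demo/exp") flattens the two-level hierarchy into its three file PIDs, and get_values then reads one typed property for each of them.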

(8)

PIT adoption: replication use case

Data collection replication:

DKRZ acts as a core replication center for data, e.g. from large international climate intercomparison projects (CMIPs)

We are defining generic PID types for collections (coll_t) as well as replicas (rep_t), and will feed our experience back to RDA

→ Based on this, the next step is to develop a generic replication service

[Figure: data life cycle diagram (data generation / ingest, dissemination / evaluation, long-term archiving) with marker 2.]

Replication service:

A) check_service_preconditions (PIT profile conformance check)

B.1) Get parts of collection: PID1 → (PIDa,..,PIDx) (see collection management service)

B.2) Get access URLs and checksums: (PIDa,..,PIDn) → (URLa,..,URLn), (csum1,..,csumn)

C) Replicate/check and create new (replica) collection
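The replication steps A to C could look roughly like the following Python sketch. It is not the DKRZ service: the dictionaries standing in for the PID registry and the source and target storage, the record keys ("type", "members", "url", "checksum", "replica_of"), the "-rep" suffix for new PIDs and the use of SHA-256 are all assumptions made for illustration. Only the type names coll_t and rep_t come from the slide.

```python
import hashlib
from typing import Dict

Registry = Dict[str, dict]   # PID -> typed record (hypothetical layout)
Store = Dict[str, bytes]     # URL -> file content (stand-in for storage)


def checksum(data: bytes) -> str:
    """SHA-256 is an assumption; the slides do not name an algorithm."""
    return hashlib.sha256(data).hexdigest()


def check_service_preconditions(registry: Registry, coll_pid: str) -> bool:
    """Step A: the collection and all of its parts carry the needed entries."""
    coll = registry.get(coll_pid, {})
    if coll.get("type") != "coll_t" or "members" not in coll:
        return False
    return all({"url", "checksum"}.issubset(registry.get(m, {}))
               for m in coll["members"])


def replicate(registry: Registry, source: Store, target: Store,
              coll_pid: str, target_prefix: str) -> str:
    """Steps B.1, B.2 and C; returns the PID of the new replica collection."""
    if not check_service_preconditions(registry, coll_pid):
        raise ValueError(f"{coll_pid} fails the profile conformance check")

    members = registry[coll_pid]["members"]                 # B.1
    replica_pids = []
    for pid in members:
        url = registry[pid]["url"]                          # B.2
        expected = registry[pid]["checksum"]
        data = source[url]                                  # C: copy ...
        if checksum(data) != expected:                      # ... and check
            raise ValueError(f"checksum mismatch for {pid}")
        new_url = target_prefix + url.rsplit("/", 1)[-1]
        target[new_url] = data
        replica_pid = pid + "-rep"                          # hypothetical naming
        registry[replica_pid] = {"type": "rep_t", "replica_of": pid,
                                 "url": new_url, "checksum": expected}
        replica_pids.append(replica_pid)

    new_coll = coll_pid + "-rep"
    registry[new_coll] = {"type": "coll_t", "replica_of": coll_pid,
                          "members": replica_pids}
    return new_coll
```

The checksum is verified before a replica record is registered, so a corrupted transfer raises an error before the replica collection is created.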

(9)

PIT adoption: load balancing use case

High volume data access:

High bandwidth requirements for data access

Exploit replicas for data download

Generic data access service exploiting replicas

[Figure: data life cycle diagram (data generation / ingest, dissemination / evaluation, long-term archiving) with marker 2.]

Data access service:

A) user/tool provides either a collection PID or set of PIDs

B) Check applicability: PIT profile conformance check

C) Walk over PIDs (in collection) and get replica locations + checksums
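A minimal Python sketch of steps A to C follows. The slide does not say how a replica is chosen, so the round-robin choice below is an assumption, as are the registry layout (a "replicas" list of url/checksum entries per PID) and all identifiers.

```python
from typing import Dict, Iterable, List, Tuple, Union

Registry = Dict[str, dict]   # PID -> typed record (hypothetical layout)


def expand_request(registry: Registry,
                   request: Union[str, Iterable[str]]) -> List[str]:
    """Step A: accept either one collection PID or an explicit set of PIDs."""
    if isinstance(request, str):
        return list(registry[request]["members"])
    return list(request)


def conforms(registry: Registry, pid: str) -> bool:
    """Step B: each replica entry must provide a location and a checksum."""
    replicas = registry.get(pid, {}).get("replicas", [])
    return bool(replicas) and all("url" in r and "checksum" in r
                                  for r in replicas)


def plan_downloads(registry: Registry,
                   request: Union[str, Iterable[str]]) -> List[Tuple[str, str, str]]:
    """Step C: walk over the PIDs and pick one replica location per PID.

    Replicas are chosen round-robin by position (an assumed policy) so that
    consecutive files are fetched from different sites.
    """
    pids = expand_request(registry, request)
    bad = [pid for pid in pids if not conforms(registry, pid)]
    if bad:
        raise ValueError(f"non-conforming PIDs: {bad}")
    plan = []
    for index, pid in enumerate(pids):
        replicas = registry[pid]["replicas"]
        chosen = replicas[index % len(replicas)]
        plan.append((pid, chosen["url"], chosen["checksum"]))
    return plan
```

The returned plan pairs each PID with one URL and the checksum to verify after download. A real service would more likely weight the choice by site load or network distance than by position.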

(10)

DKRZ RDA adoption plans

End User services based on PITs and the Type Registry

Rich landing pages for collections, aggregating PIT metadata with external sources (QA records, annotations, ..)

Early data referencing service for data collections

PIDs to support international climate model intercomparison project (CMIP) data management

Collaboration with ESGF: see the "Big data projects meet RDA" session

DKRZ acts as long-term archival and DOI assignment center in the CMIP context; transition from PIDs to long-term DOIs

→ Interested in "data fabric" concepts etc.

(11)

Thank You

Stephan Kindermann:

[email protected]
