2010 J. Phys.: Conf. Ser. 219 062012

Data Management in EGEE

Ákos Frohner¹, Jean-Philippe Baud¹, Rosa Maria Garcia Rioja¹, Gilbert Grosdidier², Rémi Mollon¹, David Smith¹ and Paolo Tedesco¹

¹ CERN, Switzerland

² LAL/IN2P3/CNRS, France

Abstract. Data management is one of the cornerstones in the distributed production computing environment that the EGEE project aims to provide for an e-Science infrastructure.

We have designed and implemented a set of services and client components, addressing the diverse requirements of all user communities. LHC experiments as main users will generate and distribute approximately 15 PB of data per year worldwide using this infrastructure. Another key user community, biomedical projects, has strict security requirements with less emphasis on the volume of data.

We maintain three service groups for grid data management: The Disk Pool Manager (DPM) Storage Element (with more than 100 instances deployed world-wide), the LCG File Catalogue (LFC) and the File Transfer Service (FTS), which sustains an aggregated transfer rate of 1.5 GB/sec. They are complemented by individual client components and also by tools which help to coordinate more complex use cases involving multiple services (GFAL-client, lcg_util, eds-cli).

In this paper we show how these services, keeping clean and standard interfaces among each other, can work together to cover the data flow and how they can be used as individual components to cover diverse requirements. We will also describe areas that we consider for further improvements, both for performance and functionality.

1. Introduction

Data management is one of the cornerstones in the distributed production computing environment that the EGEE project aims to provide for an e-Science infrastructure [1].

This infrastructure currently includes 267 sites in 54 countries, with about 114,000 CPUs and more than 20 PB of storage in 291 storage elements (SE).

The main customer of this infrastructure is the High-Energy Physics community via the LHC Computing Grid project. They share this infrastructure with more than 15 application domains, like Life Sciences, Computational Chemistry and Earth Sciences, grouping their 16,000 users into 150 virtual organizations (VO).

The EGEE grid infrastructure consists of a set of middleware services deployed on the worldwide infrastructure. Most of these software components are provided by the project’s own development efforts under the name of gLite middleware [2].

In Section 2 we introduce the data management software stack of the EGEE project and in Sections 3-9 we describe the components provided by our team.

Section 10 describes in more detail the typical use cases from a user’s perspective and how they are implemented by the data management components.


2. EGEE Data Management

The EGEE data management components can be classified in three main categories:

• Storage Elements, as the foundation of the infrastructure

• Higher level services to cover use cases over the basic file access and

• Client tools, which interact with these services and provide simpler interfaces to the users.

Figure 1. EGEE Software Stack

Storage Elements are the most fundamental data management services, as they store the files on disk and tape. They provide access to these files via a wide variety of protocols, such as gridftp, rfio, dcap, xrootd, HTTP(S) and NFS. They also provide a management interface for the storage resources via specialized protocols and the standardized SRM [3] interface.

A number of SEs are used in the EGEE infrastructure: DPM, Castor, dCache, StoRM and BeStMan.

Higher level services provide functionality over the basic file access use cases:

• File catalogs enable the users to find their files among the storage elements. They manage the SE independent namespace of logical file names (LFN), which are mapped to a number of replicas, files in the storage elements.

• Reliable file transfer services take care of file replication among storage elements by hiding the details of resource negotiations and error handling from the users.

• Keystore service provides encryption keys for encrypted files in the storage elements.

Client tools provide application programming and command line interfaces for the above mentioned services with a layer of convenience libraries.

One of these abstractions is the POSIX style I/O API over the file access protocols provided by the storage elements. The goal of such a library is that application programmers do not need to be able to handle all protocols, but write their code against a single API, which will handle the differences.

There are also complex operations, such as uploading a file to a storage element and registering it with a file catalog, which are typical to many applications. The implementation of these typical use cases is provided as a convenience library.


3. Overview

We have designed and implemented a set of services and client components, addressing the diverse requirements of all user communities.

Figure 2. gLite data management components

We maintain three service groups for grid data management: The Disk Pool Manager Storage Element (DPM, see Section 4), the LCG File Catalogue (LFC, see Section 5), and the File Transfer Service (FTS, see Section 6).

They are complemented by individual client components and also by tools which help to coordinate more complex use cases involving multiple services, both for average clients (GFAL-client, see Section 7 and lcg_util, see Section 8) and for clients requiring encrypted files (eds-cli, see Section 9). Most of these clients are installed on all the worker nodes of the EGEE, OSG and NorduGrid infrastructures.

There are other services, which are not maintained by our team, but used by our services and tools, such as the BDII Information System or other Storage Elements (Castor, dCache, StoRM, BeStMan).

3.1. Grid Security

The gLite data management components follow the model of other grid services for authentication and authorization. They use X509 certificates to mutually authenticate clients and services to each other based on trusted Certificate Authorities, which are managed by the International Grid Trust Federation (IGTF).

The authorization decisions are based on authenticated individuals and groupings of these entities. We use the Virtual Organization Membership Service (VOMS) [4] to provide the grouping information of X509 entities in the form of Fully Qualified Attribute Names (FQAN). Otherwise, the access control information about a managed object (e.g. file, directory, channel, transfer job) is stored inside a service’s database.
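To make the FQAN-based grouping more concrete, the following minimal Python sketch parses an attribute string and compares it with an access control list. The textual form /VO/group/Role=role/Capability=NULL, the exact-match rule and the names FQAN, parse_fqan and is_authorized are assumptions made for illustration; this is not the implementation used by the gLite services.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FQAN:
    """One VOMS attribute, e.g. /dteam/higgs/Role=production."""
    group: str            # group path, starting with the VO name
    role: Optional[str]   # None when the role is absent or NULL


def parse_fqan(text: str) -> FQAN:
    """Split an FQAN string into its group path and optional role."""
    groups = []
    role = None
    for part in text.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        else:
            groups.append(part)
    return FQAN(group="/" + "/".join(groups), role=role)


def is_authorized(user_fqans, acl):
    """Return True if any of the user's FQANs exactly matches an ACL entry.

    `acl` is a hypothetical list of FQAN strings stored with the object.
    """
    allowed = {parse_fqan(entry) for entry in acl}
    return any(parse_fqan(f) in allowed for f in user_fqans)
```

For example, a user holding /dteam/Role=production would pass an ACL listing that same group and role, while a plain /dteam member would not.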

3.2. Information System

The EGEE gLite services publish information about themselves to the information system, BDII, so that the client tools can locate them. The information is currently published according to the GLUE 1.3 schema, however it is planned to be extended to the new GLUE 2.0 schema as well.

The published information, such as the service endpoint, is retrieved from the BDII using the LDAP protocol. The FTS clients and services could use other information systems as well via the Service Discovery layer.

3.3. Storage Protocols

Unless stated otherwise, the clients and services use the Storage Resource Manager (SRM) [3] protocol to manage the storage elements (SE) and the gridftp (or gsi-ftp) protocol to access the content of files.

4. DPM

The light-weight Disk Pool Manager (DPM) offers a simple solution for a disk-only Storage Element. It is easy to install and configure and requires very low maintenance effort. It is deployed in about 190 sites within EGEE. At one site the DPM is used to manage up to 360TB of data.

DPM consists of a set of services, each with its own client interface: the Disk Pool Name Server (DPNS) to keep the hierarchical namespace and the access authorizations; the Disk Pool Manager (DPM) to manage the disk space and process the user requests; the Remote File Input Output (RFIO), GridFTP, HTTP(S) and xrootd services to provide data access; and the Storage Resource Management (SRM) service as the standard Web Service interface [3].

Storage management features of this service include pool and space protection; garbage collection of unused replicas and replication of hot files (triggered by the administrator).

For clients, DPM provides an API library in C, Perl and Python (proprietary socket interface to DPNS/DPM/RFIO), a command line interface, and the standard SRM interfaces (v1.1, v2.1 and v2.2).

All the interfaces work only in secure mode using X509 certificates or Kerberos5 tokens. DPM implements POSIX style file and directory authorization, where a user name is an X509 certificate DN or Kerberos principal and groups are taken from VOMS FQANs. The authorization is implemented independent of the underlying operating system by using an internal user and group database.
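The idea of POSIX style authorization that is independent of the operating system can be sketched as follows. The class VirtualIds, the function may_access and the example DN are hypothetical and only illustrate the principle of mapping certificate DNs and VOMS FQANs to internal identifiers before checking the usual mode bits; they do not describe DPM's actual data structures.

```python
class VirtualIds:
    """Hypothetical internal user and group database.

    Certificate DNs and VOMS FQANs are mapped to virtual ids, so no
    operating system accounts are needed.
    """

    def __init__(self):
        self._uids = {}
        self._gids = {}

    def uid(self, dn):
        # Assign the next free id the first time a DN is seen.
        return self._uids.setdefault(dn, len(self._uids) + 1)

    def gid(self, fqan):
        return self._gids.setdefault(fqan, len(self._gids) + 1)


def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """POSIX style check; `want` combines 4 (read), 2 (write), 1 (execute)."""
    if uid == owner_uid:
        granted = (mode >> 6) & 7   # owner bits
    elif owner_gid in gids:
        granted = (mode >> 3) & 7   # group bits
    else:
        granted = mode & 7          # bits for everyone else
    return (granted & want) == want
```

With a file of mode 0640, the owner can read and write, members of the owning group can only read, and all other users are refused.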

DPM has a portable codebase, which has been built on a variety of Linux distributions (RedHat Enterprise Linux, Debian), Mac OS X and Solaris.

DPM does not need to contact any external service, but static and dynamic information concerning the Storage Element is published in the Information System. Like any other SE, DPM is used by applications like GFAL, lcg_util and FTS.

5. LFC

The LCG File catalogue (LFC) offers a hierarchical view of files to users, with a UNIX-like client interface. The LFC is deployed at most EGEE European and Asian sites, at some of the OSG sites, in addition to CERN.

The LFC catalogue provides mappings between a Logical File Name (LFN) and Storage URLs (SURL) with POSIX style authorization (see DPM). It supports session based connections to minimize the authentication overhead and transactions for complex modifications.
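The LFN to SURL mapping can be pictured with a small in-memory sketch. The class FileCatalog and its methods are hypothetical names used only for illustration; the real LFC keeps this mapping in a database and adds authorization, sessions and transactions, none of which are modelled here.

```python
class FileCatalog:
    """Toy catalogue mapping each logical file name to its replica SURLs."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, surl):
        """Record that `surl` is a replica of `lfn` (idempotent)."""
        surls = self._replicas.setdefault(lfn, [])
        if surl not in surls:
            surls.append(surl)

    def replicas(self, lfn):
        """Return a copy of the replica list; raises KeyError if unknown."""
        return list(self._replicas[lfn])

    def unregister(self, lfn, surl):
        """Remove one replica; drop the LFN when no replica is left."""
        self._replicas[lfn].remove(surl)
        if not self._replicas[lfn]:
            del self._replicas[lfn]
```

A client would register a replica after each successful upload and call replicas() before a download to learn where the file can be found.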

The LFC relies on a client-server model, using a proprietary socket interface. The LFC server communicates with a database (Oracle, MySQL or PostgreSQL), where all the data is stored. A read-only replicated file catalog can be deployed by using Oracle Streams, or a fail-over solution can be set up with DataGuard.

The LFC server is accessible from the client side by API libraries (POSIX style namespace operations and non-POSIX bulk methods) in C, Perl and Python and by a command line interface.

Working together with the user communities, we have extended the basic set of namespace operations over the past years to provide better performance via more complex operations: bulk deletion, querying the attributes of a set of entries and registering files in a single aggregated operation.

6. FTS

The gLite File Transfer Service (FTS) is a data movement service for transferring files between Storage Elements. It was designed to balance site resource usage, prevent network or storage overload, enforce job prioritization, retry failed transfers and facilitate administration and monitoring of transfers.

The FTS exposes an interface to submit asynchronous bulk requests and performs the transfers using either third-party GridFTP or SRM Copy. These third-party transfers enable FTS to drive transfers in parallel among many disk servers, so that it can scale up to the limits of the underlying network and reach an aggregated transfer rate of 1.5 GB/sec.

The FTS servers are typically deployed at (large) sites where there are large amounts of data to be transferred.

FTS manages the transfers in a unidirectional queue, called a channel. A channel is typically defined between two sites and describes the transfer protocol, parameters and resource restrictions.
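A channel can be thought of as a queue with a resource limit attached. The sketch below is a simplified model under that assumption: the class Channel, the single max_active limit and the site names in the usage are hypothetical and stand in for the richer set of protocol parameters and restrictions of a real FTS channel.

```python
from collections import deque


class Channel:
    """Unidirectional queue of transfers from one site to another."""

    def __init__(self, source_site, dest_site, max_active):
        self.source_site = source_site
        self.dest_site = dest_site
        self.max_active = max_active   # assumed concurrency restriction
        self.pending = deque()
        self.active = []

    def submit(self, source_surl, dest_surl):
        """Queue a transfer request; nothing is started yet."""
        self.pending.append((source_surl, dest_surl))

    def schedule(self):
        """Start queued transfers until the concurrency limit is reached."""
        started = []
        while self.pending and len(self.active) < self.max_active:
            job = self.pending.popleft()
            self.active.append(job)
            started.append(job)
        return started

    def finish(self, job):
        """Mark a transfer as completed, freeing one slot on the channel."""
        self.active.remove(job)
```

Limiting the number of active transfers per channel is one simple way to keep a site's network and storage from being overloaded while requests keep arriving.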

Figure 3. FTS channels

6.1. FTS architecture

The FTS front-end is a secure Web Service that provides three different port types for submitting requests and monitoring their status, administering and monitoring the channels, and retrieving the FTS usage statistics. The main FTS functionality is provided by a set of daemons, File Transfer Agents, responsible for triggering the third-party transfers (Channel Agent) and applying the VO-specific policies for retrying failed transfers (VO Agents). The WS front-end communicates with the Agents by storing requests into an Oracle database.

Users can access FTS services either via the Web Service API or using the provided client. FTS makes extensive use of the Service Discovery API, for discovering the endpoint and the properties of external services. From the security point of view, the interactions with these external services (mainly SRMs and Storage Elements) always use the client proxy credentials either retrieved from MyProxy or delegated by the delegation components and renewed using the proxy-renewal APIs.


7. GFAL

The Grid File Access Library (GFAL) is a library that offers the user a POSIX style API to access data on various flavours of Storage Elements offering (de-facto) standard interfaces. lcg_util includes a set of command line tools and libraries that provide higher level functionality on top of GFAL.

Here "POSIX style" means that function prototypes follow the signature of their POSIX equivalents, but with a custom prefix: int gfal_open(const char *, int, mode_t). With pre-processing macros or libc-preload wrappers it is possible to replace the normal POSIX calls by these functions.

GFAL is currently interfaced to SRM-compliant back-ends (both v1.1 and v2.2) and to de-facto standard facades of mass storage systems such as Castor, dCache or DPM. It provides a common abstraction over these interfaces by using the relevant protocols transparently behind the scenes. Using information published in the information system, it resolves abstract data and file names, so that both the physical data and the service end-points can be reached transparently. It unifies access to various types of items such as: LFN, GUID, SURL, SRM and TURL or local path. In addition, some crucial yet common back-end calls are exposed through the library, so that users are not limited to the POSIX mapping when they need specific calls, e.g. to reserve space or pin a file. The pluggable architecture of the library permits optional loading and dynamic change of the versions of some of the supported protocols (i.e. rfio, dcap) without the need for redeployment.
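The first step of such a unified access layer is to recognise which kind of name it has been given. The following sketch shows one plausible way to do this by prefix; the function classify_name, the guid: prefix and the set of transfer schemes are assumptions for illustration and do not reproduce GFAL's internal logic.

```python
# URL schemes treated as transfer URLs in this sketch (an assumption).
TRANSFER_SCHEMES = {"gsiftp", "rfio", "dcap"}


def classify_name(name):
    """Return the kind of name: 'LFN', 'GUID', 'SURL', 'TURL' or 'local'."""
    if name.startswith("lfn:"):
        return "LFN"
    if name.startswith("guid:"):
        return "GUID"
    scheme, sep, _rest = name.partition("://")
    if sep:
        if scheme == "srm":
            return "SURL"
        if scheme in TRANSFER_SCHEMES:
            return "TURL"
        if scheme == "file":
            return "local"
        raise ValueError(f"unsupported scheme: {scheme}")
    if name.startswith("file:") or name.startswith("/"):
        return "local"
    raise ValueError(f"cannot classify name: {name}")
```

Once the kind of name is known, the library can decide whether it has to consult a file catalogue, contact an SRM endpoint or open the file directly.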

8. lcg_util

lcg_util uses a mixture of GFAL and Globus functionality to create Grid equivalents of the UNIX copy commands and a registration step that spans storage systems and file catalogues.

Other functions have been implemented as well, such as getting the list of file replicas from many sources, performing lookups on LFNs or GUIDs, changing the status of a file or finally removing it from the Grid. The choice of commands implemented was user driven, and these commands are often the first point of contact for Grid users and Grid testers. lcg_util provides a C library, command-line programs based on this library and Python bindings for easier integration.

9. Encrypted Data Storage

The Encrypted Data Storage system is made of a set of components on top of the previously described data management infrastructure.

9.1. Encrypted Data Storage Client

The Encrypted Data Storage provides a client-side C library to encrypt and decrypt block level data on the fly. It uses the OpenSSL cryptographic library for the symmetric cryptography routines, thus it can utilize any of the available cipher algorithms, such as the AES cipher. The encryption/decryption keys are stored in the Hydra key store.

The component also provides command line utilities (eds-cli) for managing the keys in a Hydra key store. Other command-line utilities integrate the library with GFAL, thus one can retrieve and decrypt or encrypt and store files transparently.

9.2. Hydra Keystore

The symmetric encryption keys for encrypted data (file) storage are stored in a specific set of servers called "Hydra". Hydra provides controlled access to these keys (through ACLs based on certificate DNs and VOMS attributes) and secured communication to the requester. The Hydra service is a Java Web Service, which can be deployed in a J2EE container, such as Tomcat. It requires a database back-end, and communicates via the HTTPS protocol with its clients.

In addition, Hydra exploits the Shamir secret-sharing scheme to improve the security and reliability of this service. Shamir’s scheme consists of splitting keys into N fragments stored in different places. Only M < N fragments are needed to reconstruct a complete key. However, owning fewer than M key fragments does not give any information on the complete key. Thus, the system is both resistant to attacks (at least M key stores need to be compromised for an attacker to be able to reconstruct a key) and reliable (the disconnection of a limited number of servers does not prevent the key reconstruction).
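The (M, N) threshold idea can be illustrated with a short sketch of Shamir's scheme over a prime field. The prime modulus, the function names and the example parameters are assumptions chosen for illustration; they do not describe Hydra's actual implementation, which distributes the fragments over separate servers.

```python
import secrets

# A Mersenne prime larger than any 256-bit key (choice made for this sketch).
PRIME = 2**521 - 1


def split_secret(secret, n, m):
    """Split `secret` into n shares, any m of which can reconstruct it."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine_shares(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

With n = 5 and m = 3, for instance, any three of the five fragments returned by split_secret are sufficient for combine_shares to rebuild the key, so two key stores may be unreachable without blocking decryption.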

9.3. DPM/DICOM Interface

Digital Imaging and Communications in Medicine (DICOM) is a standard medical image storage system used by biomedical researchers, an important user community in EGEE. gLite includes an interface between DICOM and the DPM storage element. This DPM/DICOM interface is a plug-in for DPM that takes anonymized medical images from a DICOM system, enters the image metadata into a metadata storage system such as AMGA, uses EDS and Hydra to encrypt the image, stores the resulting file in DPM and stores storage metadata in LFC. These encrypted files may then be analyzed or studied in a Grid environment.

More details can be found in [5].

10. Use Cases

This section describes the typical data management use cases of the grid middleware.

10.1. Uploading a file

When a client creates a new file on the User Interface or on the Worker Node, then it needs to be uploaded to a storage element to make it accessible across the grid.

Figure 4. Uploading a file

1. Lookup of the storage element endpoint and VO directory using the destination site or SE name in BDII, for example myse.cern.ch

2. Generating a storage URL (SURL) locally based on the SE endpoint information, for example srm://myse.cern.ch:8443/srm/managerv2?SFN=/cern.ch/dteam/myfile

3. Acquiring a transfer URL for the SURL via the SRM interface of the storage element, for example gsiftp://disk101.cern.ch:2811/dteam/myfile-132871

4. Uploading the data to the TURL by gridftp

5. Registering the LFN-SURL pair in the LFC
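As an illustration of step 2, the sketch below assembles a SURL from the pieces found in the information system and splits it again. The helper names make_surl and split_surl and the default /srm/managerv2 endpoint path are assumptions based on the example URL in this section, not part of any gLite API.

```python
from urllib.parse import parse_qs, urlsplit


def make_surl(host, port, vo_dir, filename, endpoint_path="/srm/managerv2"):
    """Build a SURL from the SE endpoint and the VO directory."""
    sfn = vo_dir.rstrip("/") + "/" + filename
    return f"srm://{host}:{port}{endpoint_path}?SFN={sfn}"


def split_surl(surl):
    """Return (host, port, site file name) for a SURL in the form above."""
    parts = urlsplit(surl)
    sfn = parse_qs(parts.query)["SFN"][0]
    return parts.hostname, parts.port, sfn


surl = make_surl("myse.cern.ch", 8443, "/cern.ch/dteam", "myfile")
print(surl)
```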

The following lcg_util command will do all these steps:

lcg-cr -d myse.cern.ch -l lfn:/grid/dteam/myfile /path/to/my/local/file

10.2. Downloading a file

A client wants to download a file to the local file system, knowing its logical file name.

Figure 5. Downloading a file

1. Get a SURL for the LFN by looking up the registered replicas in LFC. If there is more than one replica, the client chooses one, preferably from the local storage element. For example, for the LFN lfn:/grid/dteam/myfile the LFC may return two SURLs:

srm://grid.edu.tw/castor/grid.edu.tw/dteam/myfile
srm://myse.cern.ch/srm/managerv2?SFN=/cern.ch/dteam/myfile

2. Lookup the SE endpoint for the SURL in BDII

3. Acquiring a transfer URL for the SURL via the SRM interface of the storage element, for example gsiftp://disk145.cern.ch:2811/dteam/myfile-137199

4. Downloading the file from the TURL by gridftp

The following lcg_util command will do all these steps:

lcg-cp lfn:/grid/dteam/myfile /path/to/my/local/file

The client tool will try to handle error conditions such as the file not being available at the first selected storage element. In this case it will iterate through the prioritized list of replicas and attempt to access them until it manages to download one of them.
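The fallback behaviour described above can be sketched as a loop over a prioritized replica list. The functions prioritize and download_first_available, the fetch callback and the use of OSError to signal a failed replica are assumptions for illustration; they do not reproduce the error handling of lcg_util.

```python
from urllib.parse import urlsplit


def prioritize(surls, local_se):
    """Order replicas so that those on the local SE come first.

    sorted() is stable, so the catalogue order is kept otherwise.
    """
    return sorted(surls, key=lambda s: urlsplit(s).hostname != local_se)


def download_first_available(surls, local_se, fetch):
    """Try each replica in priority order and return the first success.

    `fetch` is a caller-supplied function that downloads one SURL and
    raises OSError when the replica cannot be accessed.
    """
    errors = []
    for surl in prioritize(surls, local_se):
        try:
            return surl, fetch(surl)
        except OSError as exc:
            errors.append((surl, exc))   # remember the failure, try the next
    raise OSError(f"all {len(errors)} replicas failed")
```

Keeping the transfer itself behind the fetch callback lets the same loop work whether the replica is read via gridftp, rfio or another protocol.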

The client tool may choose a protocol other than gridftp to access the file. For example, a worker node may access a local storage element via rfio, dcap or even direct POSIX file open (file).

10.3. Decrypting a file

This is a variation of the previous use case: the file in the storage element is encrypted, so the client needs to get the encryption key and has to decrypt the file locally.

Figure 6. Decrypting a file

1. Get a SURL for the LFN by looking up the registered replicas in LFC

2. Lookup the SE endpoint for the SURL and the Hydra endpoints (keys are split among multiple services) for the LFN in BDII

3. Get the pieces of the en/decryption key from the Hydra services for the LFN and combine them into a single key

4. Acquiring a transfer URL for the SURL via the SRM interface of the storage element

5. Downloading the file from the TURL

6. Decrypting the downloaded file as it arrives block-by-block

The following glite-data-eds-cli command will do all these steps:

glite-eds-get lfn:/grid/dteam/myfile /path/to/my/local/file

10.4. Transferring a file

A client can initiate a file transfer between two storage elements directly or can delegate this job to the File Transfer Service, which will schedule and retry this transfer as needed.

Figure 7. Transferring a file using FTS

1. Submitting the transfer job to the FTS.

1.1 FTS acquires the transfer URL from the destination SE by its SRM interface

1.2 FTS acquires the transfer URL from the source SE by its SRM interface

1.3 FTS prepares the transfer on the destination via gridftp

1.4 FTS initiates the transfer on the source SE via gridftp and follows its progress. If something goes wrong, it will clean up the partially transferred file and retry the transfer

2. Meanwhile the client can poll the FTS for the status of the transfer job

glite-transfer-submit \
srm://myse.cern.ch:8443/srm/managerv2?SFN=/foo \
srm://myse.example.org:8443/srm/managerv2?SFN=/foo

glite-transfer-status 49ce183b-3fb4-11de-b943-abe14bae4af8

11. Future Directions

The main priorities in our team for the data management components are:

Stability: focusing on quickly solving problems, providing bug fixes and maintaining backward compatibility in the application programming and command line interfaces.

Reliability: improving the error handling in our clients and services against internal failures and also against failures of other services used by the component.

Maintainability: providing portable code to be prepared for future platform changes and documenting it internally for anyone participating in the development.

Besides these main objectives we also plan to improve the administrative tools, encapsulating routine procedures into single commands and providing web interfaces to ease the administration. We would like to provide real time monitoring information of services for easier problem determination and to enable proactive administrative changes. And we would also like to automate regular procedures, for example cleanup and archival of old and unused records.

For resource protection we plan to implement a quota system for storage and network bandwidth limits for individual file accesses.

For better integration with user data management frameworks we are looking into simplifying the client libraries and using messaging instead of polling.

The detailed plans for each individual component are maintained in the LCG Savannah software development portal as bugs [6] and tasks [7].

References

[1] Abadie L, Badino P, Baud J P, Casey J, Frohner A, Grosdidier G, Lemaitre S, Mccance G, Mollon R, Nienartowicz K, Smith D and Tedesco P 2007 Mass Storage Systems and Technologies, IEEE / NASA Goddard Conference on 060–71

[2] EGEE gLite middleware URL http://glite.org

[3] Abadie L, Badino P, Baud J P, Corso E, Crawford M, Witt S D, Donno F, Forti A, Frohner A, Fuhrmann P, Grosdidier G, Gu J, Jensen J, Koblitz B, Lemaitre S, Litmaath M, Litvinsev D, Presti G L, Magnoni L, Mkrtchan T, Moibenko A, Mollon R, Natarajan V, Oleynik G, Perelmutov T, Petravick D, Shoshani A, Sim A, Smith D, Sponza M, Tedesco P and Zappi R 2007 Mass Storage Systems and Technologies, IEEE / NASA Goddard Conference on 047–59

[4] Alfieri R, Cecchini R, Ciaschini V, Dell’Agnello L, Gianoli A, Spataro F, Bonnassieux F, Broadfoot P, Lowe G, Cornwall L, Jensen J, Kelsey D, Frohner A, Groep D, De Cerff W S, Steenbakkers M, Venekamp G, Kouril D, McNab A I, Mulmo O, Silander M, Hahkala J and Lhorentey K 2003 Managing dynamic user communities in a grid of autonomous resources Tech. Rep. cs.DC/0306004

[5] Montagnat J, Jouvenot D, Pera C, Frohner A, Kunszt P Z, Koblitz B, Santos N and Loomis C 2006 Bridging clinical information systems and grid middleware: a medical data manager Tech. Rep. EGEE-PUB-2006-012

[6] EGEE JRA1 bugs and feature requests URL https://savannah.cern.ch/bugs/?group=jra1mdw
