
International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459, Volume 2, Issue 7, July 2012)


Globus Toolkit 5 (GT5): Introduction of a tool to develop Grid Application and Middleware

Milan K. Vachhani¹, Dr. Kishor H. Atkotiya²

¹Assistant Professor at MCA Department, B. H. Gardi College of Eng. & Tech., Rajkot, Gujarat, India.

²Associate Professor & Head, Computer Science, J. H. Bhalodia Women’s College, Rajkot, Gujarat, India.

Abstract: This paper presents an introduction to Globus Toolkit 5 (GT5), the latest version of the Globus Toolkit. Grid computing is a recent technology of the 21st century. Among the many tools for developing grid applications, GT5 is one that can be used to create grid applications or the middleware of a grid computing architecture. The Globus Toolkit is a set of libraries and programs that address common problems that occur when building distributed system services and applications. This paper introduces GT5 and describes its components: Data Management, Jobs Management, Security, and Common Runtime.

Keywords — CRL, GRAM, Grid Computing, GridFTP, GSI C, GSI-OpenSSH, GUI, MyProxy, PKI, RFT, RSL, SimpleCA, XIO

I. INTRODUCTION

Globus [1] is:

• A community of users and developers who collaborate on the use and development of open source software, and associated documentation, for distributed computing and resource federation.

• The software itself (the Globus Toolkit): a set of libraries and programs that address common problems that occur when building distributed system services and applications.

• The infrastructure that supports this community: code repositories, email lists, problem tracking system, and so forth.

The software itself provides a variety of components and capabilities, including the following:

• A set of service implementations focused on infrastructure management.

• A powerful standards-based security infrastructure.

• Tools for building new Web services, in Java, C, and Python.

• Both client APIs (in different languages) and command line programs for accessing these various services and capabilities.

• Detailed documentation on these various components, their interfaces, and how they can be used to build applications.

These components in turn enable a rich ecosystem of components and tools that build on, or interoperate with, GT components—and a wide variety of applications in many domains. From our experiences and the experiences of others in developing and using these tools and applications, we identify commonly used design patterns or solutions, knowledge of which can facilitate the construction of new applications.

Grid computing is a recent technology of the 21st century. Many tools exist for developing grid applications; GT5 is one with which you can build grid applications or the middleware of a grid computing architecture.

The open source Globus Toolkit [2, 3] is a fundamental enabling technology for the "Grid". It provides a facility to share computing power, databases, and other tools securely online across industry, institutional, and geographic boundaries without sacrificing personal freedom. It contains software services and libraries for resource monitoring, discovery, and management, as well as for security and file management.

II. COMPONENTS OF GT5.2

• Data Management
    - GridFTP

• Jobs Management
    - GRAM5

• Security
    - GSI C
    - MyProxy
    - GSI-OpenSSH
    - SimpleCA

• Common Runtime
    - XIO


Fig. 1: Globus Toolkit Version 5 (GT5)

The components listed above are described below.

(A) GridFTP:

The GridFTP protocol [4] was defined to make the transport of data secure, reliable, and efficient for distributed science collaborations. It extends the standard File Transfer Protocol (FTP) with useful features such as Grid Security Infrastructure (GSI) security, increased reliability via restart markers, high-performance data transfer using striping and parallel streams, and support for third-party transfer between GridFTP servers.
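The idea behind parallel streams can be illustrated with a small sketch: a file is divided into byte ranges, and each range could then be sent over its own stream. The function name `split_into_ranges` and the even-split policy are assumptions made for illustration; this is not the GridFTP wire protocol or the Globus implementation.

```python
def split_into_ranges(total_bytes: int, streams: int) -> list[tuple[int, int]]:
    """Divide a file of total_bytes into (offset, length) pairs, one per stream.

    Hypothetical helper: it only shows how work could be divided among
    parallel streams, not how GridFTP actually schedules blocks.
    """
    if total_bytes < 0 or streams < 1:
        raise ValueError("total_bytes must be >= 0 and streams >= 1")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if index < extra else 0)
        if length == 0:
            continue  # more streams than bytes: skip empty ranges
        ranges.append((offset, length))
        offset += length
    return ranges
```

For example, `split_into_ranges(10, 3)` yields three contiguous ranges that together cover all 10 bytes.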

One of the foundational issues in high-performance computing (HPC) is the ability to move large (multi-gigabyte, and even terabyte) file-based data sets between sites. Simple file transfer mechanisms such as FTP and SCP are not sufficient from either a reliability or a performance perspective. GridFTP extends the standard FTP protocol to provide a high-performance, secure, reliable protocol for bulk data transfer.

GridFTP Protocol:

GridFTP is a protocol defined by Global Grid Forum Recommendation GFD.020, RFC 959, RFC 2228, RFC 2389, and a draft before the IETF FTP working group. Key features include:

• Performance - The GridFTP protocol supports using parallel TCP streams and multi-node transfers to achieve high performance.

• Checkpointing - The GridFTP protocol requires that the server send restart markers (checkpoints) to the client.

• Third-party transfers - The FTP protocol on which GridFTP is based separates control and data channels, enabling third-party transfers, that is, the transfer of data between two end hosts, mediated by a third host.

• Security - Provides strong security on both control and data channels. The control channel is encrypted by default. The data channel is authenticated by default, with optional integrity protection and encryption.
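The checkpointing feature can be pictured as follows: if the client remembers which byte ranges have been acknowledged, it can work out what is still missing after a failure and request only that. The sketch below is a minimal illustration under that assumption; the function name `remaining_ranges` and the half-open `(start, end)` representation are hypothetical and do not reproduce the restart marker format used by GridFTP.

```python
def remaining_ranges(total_bytes: int,
                     received: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the half-open byte ranges of a file that still need transfer.

    `received` holds (start, end) ranges already confirmed, in any order
    and possibly overlapping. This is an illustrative model only.
    """
    # Merge overlapping or adjacent confirmed ranges.
    merged: list[tuple[int, int]] = []
    for start, end in sorted(received):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Walk the merged ranges and collect the gaps between them.
    gaps = []
    cursor = 0
    for start, end in merged:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_bytes:
        gaps.append((cursor, total_bytes))
    return gaps
```

A transfer interrupted after ranges 0-30 and 50-70 of a 100-byte file had arrived would resume with only the ranges 30-50 and 70-100.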

Globus Implementation of GridFTP:

The GridFTP protocol provides for the secure, robust, fast and efficient transfer of (especially bulk) data. The Globus Toolkit provides the most commonly used implementation of that protocol, though others do exist (primarily tied to proprietary internal systems).

The Globus Toolkit provides:

• a server implementation called globus-gridftp-server,

• a scriptable command line client called globus-url-copy, and

• a set of development libraries for custom clients.

While the Globus Toolkit does not provide a client with Graphical User Interface (GUI), Globus Online provides a web GUI for GridFTP data movement.

GridFTP Clients:

Globus Online is the recommended interface for moving data to and from GridFTP servers. Globus Online provides a web GUI, a command line interface, and a REST API for GridFTP data movement. It provides automatic fault recovery and automatic tuning of optimization parameters to achieve high performance.

The Globus Toolkit provides a GridFTP client called globus-url-copy, a command line interface suitable for scripting. For example, the following command:

globus-url-copy gsiftp://remote.host.edu/path/to/file file:///path/on/local/host

would transfer a file from a remote host to the locally accessible path specified in the second URL.
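Because globus-url-copy is meant for scripting, it can be driven from another program. The following is a minimal sketch, assuming globus-url-copy is installed and on the PATH and that valid credentials are already in place; the helper names `build_copy_command` and `run_copy` are hypothetical, and only the two-URL form of the command shown in the text is used.

```python
import shutil
import subprocess


def build_copy_command(source_url: str, dest_url: str) -> list[str]:
    """Build the argument list for a source-to-destination copy."""
    for url in (source_url, dest_url):
        if "://" not in url:
            raise ValueError(f"not a URL: {url!r}")
    return ["globus-url-copy", source_url, dest_url]


def run_copy(source_url: str, dest_url: str) -> int:
    """Run the copy and return the exit status of the client."""
    if shutil.which("globus-url-copy") is None:
        raise FileNotFoundError("globus-url-copy is not on PATH")
    # An argument list (not a shell string) avoids quoting problems.
    result = subprocess.run(build_copy_command(source_url, dest_url), check=False)
    return result.returncode
```

Calling `run_copy("gsiftp://remote.host.edu/path/to/file", "file:///path/on/local/host")` would issue the same transfer as the command line example.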

Finally, if you wish to add access to files stored behind GridFTP servers to your own application, or you need custom client functionality, you can use the Globus client libraries to develop it.

(B) GRAM5:

Globus implements the Grid Resource Allocation and Management (GRAM5) [5] service to provide initiation, monitoring, management, scheduling, and/or coordination of remote computations. In order to address issues such as data staging, delegation of proxy credentials, and job monitoring and management, the GRAM server is deployed along with Delegation and Reliable File Transfer (RFT) servers.


To achieve this, GRAM provides various interfaces/adapters to communicate with local resource schedulers (e.g., Condor, PBS, LSF) in their native messaging formats. The job details are passed to GRAM in a job description language known as the Resource Specification Language (RSL).
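As an illustration, a job description in the classic (non-XML) RSL syntax accepted by GRAM5 is a conjunction of parenthesized attribute=value relations, for example `&(executable="/bin/date")(count="1")`. The sketch below renders such a string from a Python dictionary. The function name `to_rsl` is hypothetical, the attribute names in the example are assumptions chosen for illustration, and the sketch covers only simple single-valued relations rather than the full RSL grammar.

```python
def to_rsl(attributes: dict[str, object]) -> str:
    """Render simple attribute-value pairs as a conjunctive RSL string.

    Illustrative only: values are emitted as quoted strings, and values
    containing a double quote are rejected rather than escaped.
    """
    relations = []
    for name, value in attributes.items():
        text = str(value)
        if '"' in text:
            raise ValueError(f"unsupported character in value for {name!r}")
        relations.append(f'({name}="{text}")')
    # The leading "&" marks a conjunction: all relations must hold.
    return "&" + "".join(relations)


# Example job: run /bin/date once (attribute names assumed for illustration).
job = to_rsl({"executable": "/bin/date", "count": 1})
```

Building the string from a dictionary keeps the job parameters in one place before they are handed to a submission tool.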

RSL provides a syntax of attribute-value pairs for describing the resources required for a job, including memory requirements, the number of CPUs needed, etc.

GSI C, MyProxy and GSI-OpenSSH for Grid Security:

These components establish the identity of users or services (authentication), protect communications, and determine who is allowed to perform what actions (authorization), as well as manage user credentials.

(C) Grid Security Infrastructure in C (GSI C):

The Globus Toolkit GSI C component provides APIs and tools for authentication, authorization and certificate management. The authentication API is built using Public Key Infrastructure (PKI) technologies, e.g. X.509 Certificates and TLS.

In addition to authentication it features a delegation mechanism based upon X.509 Proxy Certificates. Authorization support takes the form of a couple of APIs. The first provides a generic authorization API that allows callouts to perform access control based on the client's credentials (i.e. the X.509 certificate chain). The second provides a simple access control list that maps authorized remote entities to local (system) user names. The second mechanism also provides callouts that allow third parties to override the default behavior and is currently used in the Gatekeeper and GridFTP servers. In addition to the above there are various lower level APIs and tools for managing, discovering and querying certificates.
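The second mechanism, a list that maps authorized remote entities to local user names, can be sketched as a small parser. The sketch assumes the conventional grid-mapfile layout, in which each line holds a quoted certificate subject followed by one or more comma-separated local account names; the function names and the example subject are hypothetical, and the real GSI C API is not used here.

```python
import shlex


def parse_gridmap(text: str) -> dict[str, list[str]]:
    """Parse grid-mapfile style text into {subject: [local accounts]}."""
    mapping: dict[str, list[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {raw_line!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


def local_user_for(mapping: dict[str, list[str]], subject: str):
    """Return the first local account mapped to subject, or None."""
    accounts = mapping.get(subject)
    return accounts[0] if accounts else None


EXAMPLE = '''
# subject                                local account(s)
"/O=Grid/OU=Example/CN=Jane Doe" jdoe,guest
'''
```

A subject with no entry maps to `None`, which a server would treat as "not authorized".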

GSI uses public key cryptography (also known as asymmetric cryptography) as the basis for its functionality. Many of the terms and concepts used in this description of GSI come from its use of public key cryptography.

The primary motivations behind GSI are:

• The need for secure communication (authenticated and perhaps confidential) between elements of a computational Grid.

• The need to support security across organizational boundaries, thus prohibiting a centrally-managed security system.

• The need to support "single sign-on" for users of the Grid, including delegation of credentials for computations that involve multiple resources and/or sites.

(D) MyProxy:

MyProxy[6] is open source software for managing X.509 Public Key Infrastructure (PKI) security credentials (certificates and private keys).

MyProxy combines an online credential repository with an online certificate authority to allow users to securely obtain credentials when and where needed.

Users run myproxy-logon to authenticate and obtain credentials, including trusted CA certificates and Certificate Revocation Lists (CRLs).

While this may sound daunting, the concept (and practice) is straightforward: it’s a service into which you can store X.509 proxy credentials, protected by a passphrase, for later retrieval over the network. Storing credentials in a MyProxy repository eliminates the need for manually copying private key and certificate files between machines. A credential stored in MyProxy can also be accessed at times when the user’s credential would not otherwise be accessible: for example, when the user wants to authenticate to a grid portal from a Web browser, or if a job manager wants to renew the user’s credential and the user is not available.
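The store-then-retrieve idea can be modelled with a toy in-memory repository. This is only a conceptual sketch: the class `ToyCredentialStore` and its methods are hypothetical, the credential is treated as opaque bytes, and nothing here models X.509 proxy delegation, the MyProxy protocol, or network access.

```python
import hashlib
import hmac
import os
import time


class ToyCredentialStore:
    """Passphrase-protected, time-limited storage for opaque credentials."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _digest(passphrase: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000)

    def store(self, username, passphrase, credential: bytes,
              lifetime_seconds: float, now=None):
        now = time.time() if now is None else now
        salt = os.urandom(16)
        # Keep only a salted digest of the passphrase, never the passphrase.
        self._entries[username] = (
            salt, self._digest(passphrase, salt), credential,
            now + lifetime_seconds,
        )

    def retrieve(self, username, passphrase, now=None) -> bytes:
        now = time.time() if now is None else now
        if username not in self._entries:
            raise KeyError(username)
        salt, digest, credential, expires = self._entries[username]
        if not hmac.compare_digest(self._digest(passphrase, salt), digest):
            raise PermissionError("wrong passphrase")
        if now >= expires:
            raise PermissionError("credential expired")
        return credential
```

A portal or job manager holding the user name and passphrase could call `retrieve` at a later time, from another machine in the real system, without the user copying key files by hand.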

Grid Portals, based on standard Web technologies, are increasingly used to provide user interfaces for Computational and Data Grids. However, such Grid Portals do not integrate cleanly with existing Grid security systems such as the Grid Security Infrastructure (GSI), due to the lack of delegation capabilities in Web security mechanisms. An online credential repository such as MyProxy solves this problem: it allows Grid Portals to use the GSI to interact with Grid resources in a standard, secure manner. Seeing how MyProxy enables the two to function together requires an understanding of the requirements of Grid Portals and an overview of the GSI.

An overview of the MyProxy system [7] is given in Figure 3.

Figure 3: An Overview of MyProxy System

(E) GSI-OpenSSH:


The GSI-OpenSSH distribution provides gsissh, gsiscp, and gsisftp clients that function equivalently to the ssh (secure shell), scp (secure copy), and sftp (secure FTP) clients, except for the addition of X.509 authentication and delegation.

A brief overview of OpenSSH follows. OpenSSH is a free version of the SSH connectivity tools that technical users of the Internet rely on.

Users of telnet, rlogin, and ftp may not realize that their password is transmitted across the Internet unencrypted, but it is. OpenSSH encrypts all traffic (including passwords) to effectively eliminate eavesdropping, connection hijacking, and other attacks. Additionally, OpenSSH provides secure tunneling capabilities and several authentication methods, and supports all SSH protocol versions.

(F) SimpleCA:

SimpleCA[10] is a simple certificate authority available for testing purposes. SimpleCA provides a simple implementation of a certification authority which can issue X.509 certificates to Globus Toolkit users and services.

The SimpleCA package provides a certification authority that a user can install and use to issue credentials to Globus Toolkit users and services. This package is meant for use in situations where the user wants public key credentials, for example in order to test GT’s operation, but does not have access to a proper certification authority.

SimpleCA has the following features [11]:

• Easy creation of X.509 certificates for use with the Globus Toolkit

• Easy creation of GPT packages for the created SimpleCA

SimpleCA depends on the following GT components:

• Non-WS Authentication and Authorization

SimpleCA depends on the following 3rd party software:

• OpenSSL

(G) XIO:

Globus XIO[12] is an extensible input/output library written in C for the Globus Toolkit. It provides a single API (open/close/read/write) that supports multiple wire protocols, with protocol implementations encapsulated as drivers. The XIO drivers distributed with 5.2.0 include TCP, UDP, file, HTTP, GSI, GSSAPI_FTP, TELNET and queuing. In addition, Globus XIO provides a driver development interface for use by protocol developers. This interface allows the developer to concentrate on writing protocol code rather than infrastructure, as XIO provides a framework for error handling, asynchronous message delivery, timeouts, etc. The XIO driver-based approach maximizes the reuse of code by supporting the notion of a driver stack.

XIO drivers can be written as atomic units and stacked on top of one another. This modularization provides maximum flexibility and simplifies the design and evaluation of individual protocols.
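The driver stack idea can be sketched in a few lines. The example below is a Python analogy rather than the Globus XIO C API: every class name is hypothetical, the in-memory "transport" stands in for a real driver such as TCP, and the XOR transform stands in for a driver such as GSI that changes bytes on their way through the stack.

```python
class MemoryDriver:
    """Bottom of the stack: a stand-in transport that buffers bytes."""

    def __init__(self):
        self._buffer = bytearray()
        self.is_open = False

    def open(self):
        self.is_open = True

    def write(self, data: bytes) -> int:
        self._buffer.extend(data)
        return len(data)

    def read(self, size: int) -> bytes:
        chunk = bytes(self._buffer[:size])
        del self._buffer[:size]
        return chunk

    def peek(self) -> bytes:
        return bytes(self._buffer)

    def close(self):
        self.is_open = False


class XorDriver:
    """Transform driver: alters bytes, then delegates to the driver below."""

    def __init__(self, lower, key: int):
        self._lower = lower
        self._key = key & 0xFF

    def open(self):
        self._lower.open()

    def write(self, data: bytes) -> int:
        return self._lower.write(bytes(b ^ self._key for b in data))

    def read(self, size: int) -> bytes:
        return bytes(b ^ self._key for b in self._lower.read(size))

    def close(self):
        self._lower.close()


class CountingDriver:
    """Pass-through driver that records how many bytes were written."""

    def __init__(self, lower):
        self._lower = lower
        self.bytes_written = 0

    def open(self):
        self._lower.open()

    def write(self, data: bytes) -> int:
        written = self._lower.write(data)
        self.bytes_written += written
        return written

    def read(self, size: int) -> bytes:
        return self._lower.read(size)

    def close(self):
        self._lower.close()


def build_stack():
    """Stack counting on top of XOR on top of the memory transport."""
    transport = MemoryDriver()
    stack = CountingDriver(XorDriver(transport, key=0x5A))
    return stack, transport
```

The caller sees only open/close/read/write on the top of the stack; each driver is written independently and can be reordered or replaced without touching the others.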

The GT eXtensible I/O (XIO) library is used within various GT components, particularly GridFTP, to implement file I/O and communication functions. It should also be of interest to developers of other similar systems.

XIO [13] provides a single POSIX-like API (open/close/read/write) that supports multiple wire protocols, with protocol implementations encapsulated as drivers.


(H) C Common Libraries:

GT 5.2 includes a set of C Common libraries [14] needed for building grid infrastructure. The C Common Libraries provide an abstraction layer for data types, libc system calls, and data structures used throughout the Globus Toolkit and useful for applications that use the Globus Toolkit.

III. CONCLUSIONS

Grid computing is a technology for sharing resources across a network. Grid computing applications, or the middleware of a grid computing model, can be developed using the Globus Toolkit. The latest version of the Globus Toolkit is version 5 (GT5). GT5 provides components for Data Management, Jobs Management, Security, and Common Runtime.

REFERENCES

[1] Ian Foster, "Globus Toolkit Version 4: Software for Service-Oriented Systems", LNCS 3779, pp. 2-13, 2005, IFIP International Federation for Information Processing 2005

[2] Globus Toolkit - http://en.wikipedia.org/wiki/Globus_Toolkit

[3] Globus Project, June 2012, http://www.globus.org

[4] John Bresnahan, Michael Link, Rajkumar Kettimuthu, Ian T. Foster, "Managed GridFTP", IPDPS Workshops 2011: 907-913

[5] Pathak J, Treadwell J, Kumar R, Vitale P, Fraticelli F, "A Framework for Dynamic Resource Management on the Grid", HP Labs Technical Report, HPL-2005-153, 2005.


[7] Jim Basney, Marty Humphrey, Von Welch, "The MyProxy online credential repository", Software: Practice and Experience 2005; 00:1–17

[8] http://globus.org/toolkit/docs/5.2/5.2.1/gsiopenssh/

[9] http://grid.ncsa.illinois.edu/ssh/

[10] Barry Wilkinson, "Grid Computing: Techniques and Applications", ISBN - 1420069535, 9781420069532

[11] SimpleCA, http://www.globus.org/toolkit/docs/5.2/5.2.1/simpleca/rn/simplecaReleaseNotes.pdf

[12] Karl Jeacle, Jon Crowcroft, "A Multicast Transport Driver for Globus XIO", WETICE 2005: 284-289

[13] Rajkumar Kettimuthu, Liu Wantao, Joseph Link and John Bresnahan, "A GridFTP Transport Driver for Globus XIO", Proceedings of the 2008 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2008), July 2008
