190391-IJECS-IJENS © October 2009 IJENS I J E N S
Leveraged Assortment of Microsoft Technologies:
Marching Onwards in the Coming Battle Field of Grid
Computing
Muhammad Mujahid Iqbal
Department of Electrical Engineering, National University of Computer & Emerging Sciences, Pakistan [email protected]
Abstract-- The availability of powerful microprocessors and high-speed networks as commodity components has made high-performance computing on distributed systems (wide-area cluster computing) more widely adoptable. In this environment, because the resources are usually distributed geographically at various levels, there is a great challenge in integrating, coordinating and presenting them as a single resource to the user, thus forming a distributed grid. The grid is a reactive strategy to the high price of supercomputers, which prevents researchers from accomplishing many research projects such as weather forecasting, nuclear simulations, bio-informatics and other applications that require high-performance computing. So far, various grid-enabled paradigms have been developed to carry out large-scale problems in various fields of science, engineering and commerce. Given the rapid emergence of grid computing and the increasing popularity of the software development platform based on Microsoft’s .NET Framework, Windows-based grid computing is particularly important from the software industry’s perspective. To leverage the combined power of the .NET Framework, SQL Server and Windows, a phase-shifted grid paradigm is adopted that can harness the power of Microsoft technologies.
Index Terms-- .NET framework, Middleware.
I. INTRODUCTION
The foremost theme of this paper is to put forward the idea of combining Microsoft’s various software development technologies, such as the .NET architecture. The approach takes the concept a step further by developing a collaborative strategy for progress in .NET grid-aware applications.
In Asia, very limited research is being carried out; only a few educational institutes are seriously working on cluster [1] and grid computing. At the moment CERN is constructing the Large Hadron Collider (LHC) [3] and four huge experiments: ALICE, ATLAS, CMS (Compact Muon Solenoid experiment) and LHCb [3],[4]. By the year 2008 there will be an enormous demand for computing, which has led CERN to study Grid computing. The large size of those collaborations, up to 2000 scientists, allows people from weaker countries to join those from stronger regions. Czechoslovakia, India, Pakistan, Russia and Taipei are examples of countries actively participating in setting up their own Grid centers. Pakistan is now building up a high-energy physics community that will participate in the CMS experiment. Although these advancements are taking place at the higher level of academia and are of a single nature, i.e. concerned with physics, this is the time to make academic institutions at lower levels of research aware of the need to develop programming and architectural advancements for grid-aware applications.
Today’s PC has evolved into a powerful machine, powerful enough to launch a nuclear arsenal, yet its true potential is hardly ever exploited and it remains underutilized. Many problems, such as finding large primes, require intensive computation, and constructing supercomputers that can complete such tasks easily is very expensive. Grid computing provides a viable solution for high-powered computing.
II. BACKGROUND AND JUSTIFICATION
An issue that computer scientists have faced over the years is the proper utilization of computational resources and of parallel computing.
Over the years computer hardware has evolved, and today’s desktop PCs are capable of deploying an intercontinental ballistic missile. But these powerful machines are still underutilized. With the advent of the Internet, most of these machines are now able to communicate with each other. An average user hardly ever utilizes the full power of a modern PC.
Some problems are inherently concurrent in nature, while others require supercomputing. But there is no suitable computer architecture readily available that can tackle problems of such nature. Investing in building a supercomputer is an expensive venture, and the risks are very high.
A remedy to the entire problem is Grids. The whole idea of Grids was to make computational power as easily accessible as electrical power grids. Grids have provided an economical solution for resource utilization, virtual organization and state of the art computing.
simulation, combining supercomputers to simulate gravitational fields of a black hole and many more.
As grids have been accepted as a versatile integrated medium to various fields, hundreds of grids are being installed in a variety of scientific and commercial settings.
TABLE I
Famous Grid Computing Projects
Area of Interest | Projects
Science | SETI@home; Analytical Spectroscopy Research Group; Evolutionary Research; Climateprediction.com; Distributed Particle Accelerator
Life Science | Folderol; Folding@home; Genome@home; FightAIDS@home; Find-a-Drug
Cryptography | Distributed.net; ECCp-109
Mathematics | Great Internet Mersenne Prime Search; Roth Prime Search; Minimal Equal Sums of Like Powers; GRISK; MM61 Project; Pi(x) Project; 3x + 1 Problem; Distributed Search for Fermat Number Divisors; PCP@Home
III. .NET GRID COMPUTING
At present, different grid vendors, like Globus, IBM, OGSA etc., have developed their grid-deployment toolkits. But most of them are based on Unix/Linux-class operating systems. This can be beneficial for dedicated clusters within organizations, yet it severely limits the ability to utilize computing resources effectively, since the greater part of desktop-computer users run various flavors of the Microsoft Windows operating system. For this reason, a Microsoft Windows-based grid-computing infrastructure could play a critical role in the industry-wide adoption of grids [6] [7] [8], owing to the large-scale deployment of Windows within enterprises. It enables the harnessing of the unused computational power of desktop PCs and workstations to create a virtual supercomputing resource at a fraction of the cost of traditional supercomputers. However, there is a distinct lack of service-oriented architecture-based grid computing software in this space. To overcome this limitation, we have chosen a Windows-based grid computing framework called Grid .NET framework, implemented on the Microsoft .NET Platform.
While the notion of grid computing is simple enough, the practical realization of grids poses a number of challenges. Key issues that need to be dealt with are security, heterogeneity, reliability, application composition, scheduling, and resource management [11]. The Microsoft .NET Framework [5] provides a powerful toolset that can be leveraged for all of these, in particular support for remote execution (via .NET Remoting [9] and web services [10]), multithreading, security, asynchronous programming, disconnected data access, managed execution and cross-language development, making it an ideal platform for grid computing middleware.
IV. ARCHITECTURE
Our goal is to provide a Grid computing framework with a set of tools and APIs which will provide access to the Grid. The basic structure of the Grid would consist of different nodes (Manager, Executor and User). The Manager’s application will perform administrative duties including dispatching, scheduling and monitoring tasks on the grid. On the Executor’s end there will be an application responsible for running tasks on the work agent. There will be tools to monitor and administer resource utilization. A set of APIs will be provided to the user for submitting tasks (applications) that need computational resources. We also aim to incorporate web services for interoperability (cross-platform support), to provide a gateway for inter-grid communication, and to support other currently available frameworks. For job submission, our goal is to provide an object-oriented Grid Thread Model as a user-friendly platform for the development of new Grid applications; in addition, traditional job submission through a set of input files will also be provided.
Our goal is to provide role-based security. Authentication would be simple and password based. Authorization would be based on the roles of the user. Another idea is to record all jobs/tasks performed on the grid in a database, linked to the user account used for authentication. To ensure security, our aim is to run remote jobs in a sandbox as background processes so that other user applications are not affected.
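The password-plus-role scheme outlined above can be sketched as follows. This is only an illustration in Python, not the .NET implementation: the role names, the permission names, the `AccountStore` class and the fixed demo salt are all hypothetical, and the sandboxing of remote jobs is not shown.

```python
import hashlib
import hmac

# Hypothetical roles and permissions, chosen only for illustration.
ROLE_PERMISSIONS = {
    "administrator": {"manage_users", "view_logs", "submit_application"},
    "user": {"submit_application"},
    "executor": {"execute_thread"},
}


def _hash(password, salt):
    """Derive a password hash so that plain passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class AccountStore:
    """In-memory stand-in for the user database (assumption)."""

    def __init__(self):
        self._accounts = {}  # username -> (salt, password_hash, role)
        self.job_log = []    # (username, job_name) records

    def add_user(self, username, password, role, salt=b"demo-salt"):
        # A fixed salt keeps the sketch short; a real system would use a
        # random salt per account.
        self._accounts[username] = (salt, _hash(password, salt), role)

    def authenticate(self, username, password):
        """Simple password-based authentication."""
        record = self._accounts.get(username)
        if record is None:
            return False
        salt, stored, _role = record
        return hmac.compare_digest(stored, _hash(password, salt))

    def authorize(self, username, password, permission):
        """Allow an activity only if the user's role grants the permission."""
        if not self.authenticate(username, password):
            return False
        role = self._accounts[username][2]
        return permission in ROLE_PERMISSIONS.get(role, set())

    def submit_job(self, username, password, job_name):
        """Record every accepted job against the submitting account."""
        if not self.authorize(username, password, "submit_application"):
            return False
        self.job_log.append((username, job_name))
        return True
```

Keeping the job log keyed by user name mirrors the idea of linking recorded jobs to the account that was authenticated.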
V. SYSTEM DESCRIPTION
An Administrator manages user accounts, execution logs and monitoring tools. The Administrator must be proficient in operating a computer and should have good knowledge of the Windows OS, as the Administrator will be using the tools provided by our system. A Grid Application Programmer (Grid user) uses the set of APIs provided and submits applications to the Manager. A Grid user must be proficient in operating a computer, should have good knowledge of the Windows OS, and should also be well versed in .NET 2.0, as the Grid user would be using it for the application that is to be run on the grid. An Executor node carries out the actual execution of applications. An Executor user must be proficient in operating a computer and should have good knowledge of the Windows OS, as the Executor user will be using the tools provided by our system. Furthermore, they should be familiar with IP addresses and ports, as the tools being used require them.
Processing speed: 1.5 GHz or higher. Memory: minimum of 256 MB RAM. Operating system: Microsoft Windows (Windows 2000, Windows XP, Windows 2003 Server). The .NET Framework is required to run the system. Local security measures such as firewalls may become an obstacle for our system’s communication. Our system’s operating language is English. Performance of the system largely depends on the hardware and the operating system, which in our case is MS Windows. As a 32-bit application, it supports the x86 architecture (Intel and AMD).
VI. HARDWARE, SOFTWARE AND COMMUNICATION INTERFACE
The network adaptors we will be using are a modem and LAN cards. They provide connectivity between the different components of our system and the ability to exchange data, which will be in application format. We will require devices that support networking, such as switches and routers. Our system requires connectivity between multiple components simultaneously; these devices will help us achieve that kind of connectivity.
The .NET Framework will be used to build applications which can be executed on the system. The application will be developed using Visual Studio on the .NET Framework. After the application is completed, it will be passed on to the system to be executed. MS SQL Server will be used in our system as a database to store and retrieve data and results. Our system will have transactions with the database relating to insertion of data, resource-related data and verification.
Our system needs to connect to other components and to exchange messages and data. Most computers today have firewalls that restrict external applications from connecting to various Internet ports; therefore we need to communicate with the Windows Internet Connection Firewall (ICF) to allow our system access through the firewall. This will include informing the Windows ICF to unblock the required ports. The Grid Framework will extensively utilize high-level protocols such as HTTP and SOAP through Web Services. All the major components of the system will be able to communicate through Web Services for the purpose of interoperability. TCP/IP will be used as the main underlying communication protocol between the communicating entities. Access to the TCP/IP protocol will be through the .NET 2.0 API using .NET Remoting.
VII. HARDWARE ARCHITECTURE
The Grid.Net architecture is divided into five main layers. The layers act as a middleware between the Grid Applications and the Operating System [14]. The middleware is further divided into two parts: the User level middleware and the Core level middleware. The first layer, consisting of the Programming Model, and the second layer, consisting of the Grid.Net Threads, form the User level middleware and interact with the Grid Applications and the Core level middleware [15]. The user has no direct interaction with the Core level middleware, which is used only through the User level middleware. The Core level consists of the Manager Application and the Resource Management Protocol layer. The Resource Management Protocol is responsible for scheduling jobs to the Executors and is directly used by the Manager Application.
Fig. 1. System Architecture of .NET Grid Computing
Fig. 2. System Level Architecture
The user component submits the Application threads to the supervisor and collects the completed threads on behalf of the Application developer through the Grid.NET API.
VIII. SYSTEM DESIGN
GridApplication:
Properties:
Status: Defines what state the application is in.
Methods:
CreateObjRef: Creates a reference for an object of the application.
StartApp: Starts the application execution.
StartThread: Starts a thread.
StopApp: Stops an application.
Fig. 3. Detail System Design
GridThread:
Properties:
Application: ID of the application which the thread belongs to; it is assigned by the Grid Supervisor.
Priority: Defines the importance of the thread.
ThreadID: Unique ID given to each thread to distinguish between them.
Status: The state of the thread.
Methods:
Abort: Aborts the execution of a running thread.
CreateObjRef: Creates a reference for an object of the thread.
Start: Starts the execution of a thread.
GridJob:
Properties:
Inputfile: The collection of input files.
Outputfile: The collection of output files.
Status: Stores the current state of the job.
Methods:
Abort: Stops a running job.
Start: Begins a job.
GridUser:
Properties:
Passport: Required to log on to the GridSupervisor.
Methods:
Connect: Establishes a connection with the GridSupervisor.
Disconnect: Ends the connection with the GridSupervisor.
SubmitApplication: Submits the application.
GridWorker:
Properties:
Connection: The status of the connection established.
HeartBeatInterval: The time between heartbeats, which are used to check whether the worker is still working.
Supervisor: ID of the supervisor it is connected to.
SupervisorEndPoint:
Methods:
AbortThread: Stops a running thread.
ExecuteThread: Executes a thread.
Disconnect: Ends the connection with the GridSupervisor.
GridAppDomain:
Properties:
Domain: Identification to distinguish the domain.
Worker: ID of the worker on which the domain is created.
Methods:
CreateObjRef: Creates a reference for an object of the domain.
ExecuteThread: Starts running the thread.
GridException:
This class is used to represent the Grid Application exceptions.
GridPassport:
Fields:
Password: Required to log on to the GridSupervisor in combination with the correct Username.
GridConnection:
Properties:
Host: IP address of the connecting node.
Port: The port number.
Methods:
StartConnection: Establishes connections between nodes.
GridNode:
Properties:
Passport: Required to establish a connection between nodes.
LocalEndPoint:
Scheduler:
Properties:
Application: ID of the application to be scheduled.
Threads: ID of the thread to be scheduled.
Job: ID of the job to be scheduled.
Workers: IDs of the workers the work is being scheduled to.
DatabaseStorage:
Fields:
Properties:
Password: Required to log on in combination with the correct Username.
Username: Required to log on in combination with the correct Password.
Methods:
AddApplication: Adds another application to the supervisor to be run after scheduling.
AddJob: Adds another job to the supervisor to be run after scheduling.
AddThread: Adds another thread to the supervisor to be run after scheduling.
AddWorker: Adds another worker to the supervisor to run the work on.
Connect: Establishes a connection.
DeleteApplication: Deletes applications after they have been completed.
DeleteJob: Deletes jobs after they have been completed.
DeleteThread: Deletes threads after completion.
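To make the object-oriented thread model in the class listing more concrete, the following is a minimal Python sketch of how a GridThread and a GridApplication might relate. It is not the .NET API itself: the Python names, the `Status` values, the `SquareThread` example and the local, sequential execution in `start_app` (standing in for dispatch to remote workers) are all assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical state names; the source only says a Status exists.
    READY = "ready"
    RUNNING = "running"
    FINISHED = "finished"
    ABORTED = "aborted"


class GridThread:
    """Unit of work; subclasses override start() with the computation."""

    def __init__(self, thread_id, priority=0):
        self.thread_id = thread_id
        self.priority = priority
        self.status = Status.READY
        self.result = None

    def start(self):
        raise NotImplementedError("subclasses define the work to run")

    def abort(self):
        self.status = Status.ABORTED


class SquareThread(GridThread):
    """Toy workload used only to exercise the model."""

    def __init__(self, thread_id, value):
        super().__init__(thread_id)
        self.value = value

    def start(self):
        self.result = self.value * self.value


class GridApplication:
    """Groups threads; here they run locally instead of on remote workers."""

    def __init__(self):
        self.threads = []
        self.status = Status.READY

    def add_thread(self, thread):
        self.threads.append(thread)

    def start_app(self):
        self.status = Status.RUNNING
        for thread in self.threads:
            if thread.status is Status.ABORTED:
                continue  # aborted threads are skipped, not executed
            thread.status = Status.RUNNING
            thread.start()
            thread.status = Status.FINISHED
        self.status = Status.FINISHED
```

A developer would subclass `GridThread`, add instances to a `GridApplication`, and collect `result` from each finished thread.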
IX. CONSTRUCTION AND INSTALLATION
At present, we have implemented a standalone Desktop Grid platform, consisting in the beginning phase of six Executors (Pentium IV 1.7 GHz desktop machines with 512 MB physical memory) and a Manager (Pentium IV 2.4 GHz with 512 MB physical memory). Work is in progress to extend the platform to a larger scale. So far, a stand-alone, stable and reasonably capable workstation with 40 GB of disk storage, running Windows XP Professional with Microsoft .NET Framework 2.0 and SQL Server 2000 for database management of users, is used as the manager node (MGRID).
The MGRID is installed using the normal Windows desktop application setup. It provides services associated with managing execution of grid applications and their constituent threads. The Manager port is set to 900. Six Executors (EGRID1, EGRID2, EGRID3, EGRID4, EGRID5, EGRID6), each with a 40 GB hard disk and running Windows XP, are installed with the Executor application package and other required software packages. The host ports of the Executors are set to 901, 902, 903, 904, 905 and 906 respectively, and the Manager port is configured on each as well. All Executors are set as dedicated Executors. A non-dedicated Executor executes grid threads on a voluntary basis (it requests threads to execute from the Manager), while a dedicated Executor is always executing grid threads (it is directly provided grid threads to execute by the Manager). Dedicated execution is more suitable where the Manager and Executor are on the same Local Area Network or part of an Intranet, while non-dedicated execution is more appropriate when the Manager and Executor are to be connected over the Internet. Since our grid is based on a uni-level Intranet grid framework, we have used a switch working as a central hub between the Manager and all Executor nodes. Each Executor is now ready to register itself with the Manager, which in turn monitors its status.
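The difference between dedicated and non-dedicated execution described above is essentially push versus pull. The sketch below illustrates that distinction in Python under simplifying assumptions: everything runs in one process, the round-robin choice of dedicated Executor is an assumption, and the class and method names (`Manager.dispatch`, `Executor.volunteer`, and so on) are hypothetical rather than part of the framework.

```python
from collections import deque


class Executor:
    """Stand-in for an Executor node; 'executing' just records the work."""

    def __init__(self, name, port, dedicated):
        self.name = name
        self.port = port
        self.dedicated = dedicated
        self.done = []

    def execute(self, work):
        self.done.append(work)

    def volunteer(self, manager):
        """Pull model: a non-dedicated Executor asks the Manager for work."""
        work = manager.request_thread()
        if work is None:
            return False
        self.execute(work)
        return True


class Manager:
    def __init__(self, port=900):
        self.port = port
        self.pending = deque()
        self.dedicated = []

    def register(self, executor):
        # Only dedicated Executors are remembered for pushing work.
        if executor.dedicated:
            self.dedicated.append(executor)

    def submit(self, work):
        self.pending.append(work)

    def request_thread(self):
        """Hand out one pending thread on request, or None if idle."""
        return self.pending.popleft() if self.pending else None

    def dispatch(self):
        """Push model: give pending threads to dedicated Executors in turn."""
        index = 0
        while self.pending and self.dedicated:
            target = self.dedicated[index % len(self.dedicated)]
            target.execute(self.pending.popleft())
            index += 1
```

In the setup described here all six Executors are dedicated, so only the `dispatch` path would be exercised; the `volunteer` path corresponds to Executors reached over the Internet.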
Threads received from the User are placed in a pool and scheduled to be executed on the various available Executors. A priority for each thread can be explicitly specified when it is created or submitted. Threads are scheduled on a Priority and First Come First Served (FCFS) basis, in that order. The Executors return completed threads to the Manager, from which they are subsequently collected by the respective users. A scheduling API is provided that allows custom schedulers to be written. The Manager employs a role-based security model for authentication and authorization of secure activities. A list of permissions representing activities that need to be secured is maintained within the Manager. A list of groups (roles) is also maintained, each containing a set of permissions. For any activity that needs to be authorized, the user or program must supply credentials in the form of a user name and password, and the Manager only authorizes the activity if the user belongs to a group that contains the particular permission. Multiple users can be added to our grid. Grid applications are executed on the User node. The API abstracts the implementation of the grid from the user and is responsible for performing a variety of services on the user’s behalf, such as submitting an application and its constituent threads for execution, notifying the user of finished threads and providing results, and notifying the user of failed threads along with error details.
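The "Priority, then FCFS" ordering mentioned above can be illustrated with a small priority queue. This is a sketch rather than the framework's scheduler: the class name is hypothetical, and it assumes that a larger number means a higher priority, which the text does not specify.

```python
import heapq
import itertools


class PriorityFcfsScheduler:
    """Orders threads by priority first, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # monotonically increasing ticket

    def submit(self, thread_id, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the
        # highest priority first; the arrival ticket breaks ties FCFS.
        heapq.heappush(self._heap, (-priority, next(self._arrival), thread_id))

    def next_thread(self):
        """Return the next thread to run, or None when the pool is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

For example, submitting threads a (priority 0), b (priority 5), c (priority 5) and d (priority 0) in that order yields b, c, a, d: the two high-priority threads run first, and ties are resolved by arrival.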
X. PERFORMANCE AND BENCHMARKING
Since the Grid supports the execution of applications created using the Alchemi Grid Threads interface, in this phase we use its job model to submit jobs for processing on it.
A range of workloads (calculating 1000, 1200, 1400, 1600, 1800, 2000 and 2200 digits of Pi) is executed on EGRID1. Similarly, the same range of workloads is executed step by step over an increasing number of Executors, incremented by one at a time. Each workload is segmented into numerous grid threads, each calculating 50 digits of Pi, with the number of threads varying proportionally with the total number of digits to be calculated. Finally, the execution time for each number of Executors is measured as the elapsed clock time to complete the calculation of the digits of Pi on the Manager node (MGRID).
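The segmentation of each workload into 50-digit threads is simple arithmetic, sketched below. The function names and the half-open digit ranges are illustrative assumptions; the actual Pi computation and the measured timings are not reproduced here.

```python
import math

DIGITS_PER_THREAD = 50
WORKLOADS = [1000, 1200, 1400, 1600, 1800, 2000, 2200]


def thread_count(total_digits, per_thread=DIGITS_PER_THREAD):
    """Number of grid threads needed for a workload."""
    return math.ceil(total_digits / per_thread)


def segment(total_digits, per_thread=DIGITS_PER_THREAD):
    """Split a workload into half-open digit ranges, one per thread."""
    return [
        (start, min(start + per_thread, total_digits))
        for start in range(0, total_digits, per_thread)
    ]


# Threads per workload, e.g. 1000 digits -> 20 threads, 2200 -> 44 threads.
THREADS_PER_WORKLOAD = {digits: thread_count(digits) for digits in WORKLOADS}
```

With these workloads the thread count grows from 20 to 44 in steps of 4, which is the proportional scaling the text describes.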
Fig. 4. A plot of thread size vs. execution time on a standalone Grid.
XI. CONCLUSIONS AND FUTURE WORK
With the component-based design in mind, it would be easier to add future components to our software without exposing much of the underlying code. No future extensions or enhancements are currently planned. However, it is likely that with the introduction of .NET Framework 3.0 the software design would be ported to it, and support for Windows Vista could be added later on. There is also interest in porting the application to 64-bit (x64) architectures, and considering future trends we would like to see an x64 version of our application. Another possible future extension, currently out of our scope, is cross-platform support.
I believe that one day we will be on par with the other nations that have made remarkable achievements in this emerging field. So far, we have configured a .NET-based desktop Grid computing framework providing the runtime machinery and object-oriented programming environment to easily construct grid applications. We hope to extend the number of Executors and to merge our grid into the Global Grid Forum, providing grid capabilities that enable resource providers to share their resources and harness greater power.
XII. REFERENCES
[1] Muhammad Mujahid Iqbal, “First Practical Approach towards Homegrown Supercomputer in Pakistan”, ISBN: 978-1-4244-2293-7, 2nd ICEE, 25-26 March 2008.
[2] Gregory F. Pfister, In search of clusters: The coming battle in lowly parallel processing, Prentice Hall, ISBN: 0-13-899709-9, Upper Saddle River, NJ, USA, 1998.
[3] Compact Muon Solenoid Experiment. http://www.ipp.phys.ethz.ch/research/
[4] General information about CMS. CMS Outreach. http://cmsinfo.cern.ch/Welcome.html/
[5] Microsoft Corporation, .NET Framework Home, http://msdn.microsoft.com/netframework/
[6] Andrew Chien, Brad Calder, Stephen Elbert, and Karan Bhatia, Entropia: Architecture and Performance of an Enterprise Desktop Grid System, Journal of Parallel and Distributed Computing, Volume 63, Issue 5, Academic Press, USA, May 2003.
[7] David Anderson, Jeff Cobb, Eric Korpela, Matt Lebofsky, Dan Werthimer, SETI@home: An Experiment in Public-Resource Computing, Communications of the ACM, Vol. 45, No. 11, ACM Press, USA, November 2002.
[8] W. T. Sullivan, D. Werthimer, S. Bowyer, J. Cobb, D. Gedye, D. Anderson, A new major SETI project based on Project Serendip data and 100,000 personal computers, Proceedings of the 5th International Conference on Bioastronomy, 1997.
[9] Piet Obermeyer and Jonathan Hawkins, Microsoft .NET Remoting: A Technical Overview, http://msdn.microsoft.com/library/en-us/dndotnet/html/hawkremoting.asp (accessed November 2003)
[10] Intel Corporation, United Devices’ Grid MP on Intel Architecture, http://www.ud.com/rescenter/files/wp_intel_ud.pdf (accessed November 2003)
[11] M. Mutka and M. Livny, The Available Capacity of a Privately Owned Workstation Environment, Journal of Performance Evaluation, Volume 12, Issue 4, pp. 269-284, Elsevier Science Publishers, The Netherlands, July 1991.
[12] Ahmer Abbas, Grid Computing: A Practical Guide to Technology and Application, Firewall Media, ISBN: 81-7008-626-4, Laxmi Publications, New Delhi, India, 2004.
[13] Larry Smarr and Charlie Catlett, Metacomputing, Communications of the ACM Magazine, Vol. 35, No. 6, pp. 44-52, ACM Press, USA, Jun. 1992.
[14] De Roure, D., Baker, M. A., Jennings, N. R., & Shadbolt, N. R., The Evolution of Grid.