Formulation of Homogenous Cluster Environment using Commodity Grade computers and MPI Paradigm

(1)

93

Formulation of Homogenous Cluster Environment

using Commodity Grade computers and MPI

Paradigm

Tadrash Shah1, Neel Patel1, Nishidh Chavda2

1

B.E. (Computer Engineering).

2

Asst. Professor, CIT-Changa

Abstract - With the advent of market being flooded with cheaper processing power and memory, there is a revolutionary shift from developing a single high end machine to combining cheaper and commercial machines to serve the same purpose. In the interest of said proposition clusters have been the most effective option.

This paper focuses on formulating and implementing simpler ways for homogeneous, private, high-performance, Beowulf clustering environment, where in the nodes thus connected and configured collectively execute a bigger tasks by breaking them up into processes, and each such process run parallel. The task gets distributed on various nodes through processes and the final result is obtained at a single node form where the original task was submitted. The processes on different nodes execute in parallel. The Beowulf clustering environment was setup and used with the MPICH2 message passing interface. The output trace shows the id of processes running currently and also the hostname at which node they run.

Keywords - Beowulf, Cluster, mpi, mpich2, mpiexec

I. INTRODUCTION

Cluster, a buzz in the market and research, apart from the Grid and the Cloud technologies, which are evolving so drastically that a lot of work, is being done in the said areas and even more needs to be done still. Cluster can simply be described as a set of integrated computers,

called nodes in cluster taxonomy, connected in a network and usually encompassed by an envelope that gives the end user the feel of a single integrated system. The nodes may be spaced geographically distant. The most important facet of the cluster is it should provide a unified and coherent single system interpretation to the end user, though the developer may work on multiple nodes for configuration and maintenance purposes.

Cluster can be viewed as simultaneous execution of the computational tasks on multiple processors with the aim of obtaining quick results [5]. Chief intention of setting up a cluster is to provide high-performance computing as also distributing the large computations from a single node to other nodes, of the network, belonging to the single system envelope. The task to be run in parallel can be submitted to a master node which in turn breaks up the task into different processes and distributes to various nodes to run in parallel and final results can cohesively be obtained again from the master node. This kind of setup is often referred to as Master-Slave mechanism. This mechanism has been a core part of the cluster being setup by the authors.

(2)

94

Heterogeneous Cluster. In the view of this paper we have mainly focused on the homogenous cluster setup.

A. Basic features of a cluster

A cluster, after the setup, must possess these basic features.

1. High throughput compared to the single machine, for the same task

2. Expandable to add more nodes and scalable to serve the need of applications.

3. High performance gain compared to a single system.

In a cluster, a node may communicate with another node with the message passing paradigm, messages are passed between the logical tasks to share data and to synchronize their operations [12]. The results of the task submitted remain coherent.

As said earlier this should also provide the end user with the feel that the results that he/she is obtaining are from the single system and not from a cluster of systems. This is usually termed as Single System Image (SSI) of the cluster. A cluster should provide this feature so that the end user is saved of the technical ado for submitting the tasks and obtaining results. This concept also saves from the operator errors as the mechanism and the system may be kept centralized or decentralized, as desired, to avoid the need of skilled administrator for the cluster.

A cluster also requires job scheduling. The queues to be maintained for the job scheduling may be a single queue on the master node or a multiple queues, single queue on each slave node. This characteristic may vary as per the requirements for which the cluster is being setup. It must also be noted that if multiple queues are used then all of them must be coordinated and synchronized for the purpose of coherence and cohesiveness.

Final look on the security of the cluster, it must be made secure through various algorithms if it is exposed to the external environment, however final decision may be rested upon the fact that the cluster is being set up for which purpose. Usually for an inter-organizational cluster the cluster security may be not be a dire requirement. At the same time security must not be ignored even if a

single node is exposed to the external environment, as it may prove as a vulnerable gateway to the entire cluster.

The next section describes the characteristics of the cluster that was setup and the libraries that were utilized.

II. REPRESENTATIVE CLUSTER AND LIBRARIES USED

A. Beowulf Cluster

The cluster that we attempted to setup was Beowulf cluster. The name Beowulf comes from the old epic English poem Beowulf. A Beowulf cluster needs a central node that does some coordination among all the nodes. It defines no particular piece of software or a tool or a library. It simply identifies the interconnection of small commodity grade computers to form a homogenous cluster. This cluster can be setup on the Unix or the Linux operating system, though various flavors may be used. A Beowulf cluster is capable of many things that are applicable in areas ranging from data mining to research in physics and chemistry, all the way to the movie industry [7]. It is mainly used for two types of parallelism namely, embarrassingly parallel and explicitly parallel [7].

We used the concept of explicit parallelism [7] in Beowulf cluster. The program that we implemented on this cluster was one of the sample programs that come with MPICH2 library. The program that we used calculated the value of PI (π) through distributed and parallel processes approach. If you have the closer look at the program, essentially a C program, it implements various parallel message passing operations through the MPI calls.

B. MPICH

(3)

95

a standard for message passing in the cluster system. MPI is not and IEEE or an ISO standard but essentially an “industry standard” [8]. Various implementations and libraries are available for this standard, namely MPICH, Open MPI, LAM-MPI, etc. It defines interface specification in C, C++ and FORTRAN languages. There are defined MPI calls and primitives for the same. These can be found and explored in the MPI manual.

For this purpose we have used the open source library that allows the MPI implementation in C. It is named MPICH; developed by Mathematics and Computer Science Division, Argonne National Laboratory. The version 2 of the MPICH has been used for setting up the cluster for which the steps are discussed below, MPICH2. MPICH2 is a high performance and widely portable implementation of the message passing interface (MPI) standard [9]. This is an open source and easy to extend modular framework. The only drawback that we came across was the fact that MPICH2 cannot be used with the cluster of system that has heterogeneous representation of the data. And that is the reason that we chose the homogenous cluster setup. Default runtime environment for the MPICH2 is called Hydra [10]. Although other managers are available.

The section following will explore the steps and the necessary housekeeping required for setting up the private Beowulf cluster.

III. SETTING UP CLUSTER

A. Cluster nodes

We setup a cluster on 3 nodes initially which was expanded there on. For the paper we will continue with three systems that we used for clustering, though the same can be extrapolated for any number of nodes. Out of them one was the master node and the rest were slave nodes.

B. Hardware

We used the integrated homogenous PC of the configurations said as under -

 Intel Core2 Duo processor, 2.93GHz

 1GB RAM

 250 GB hard disk

 No extra graphics card

 Already installed Ethernet card

C. Software

The system consisted of dual boot between Windows XP and Red Hat Linux 5.0. We used the bash shell of Linux for running commands. Required Parallel Virtual Machine (PVM), already installed on RHEL. The C/C++ compiler “gcc” was also already installed on the same. Later we were required to install MPICH2 on the system, which will be discussed later on. The root user of the system was root. All the systems had this user, and the cluster was installed for this user itself. Although a different user may be created for the purpose of installation of cluster.

D. Network

The price per node of any system consists of the cost of network interface, and amortized to the cost of network switches. A switch was used to connect the PCs. We used the switch 1 to 100GBPS, 24 port Ethernet switch. All the communication was allowed through TCP.

The internal nodes are not directly connected to the external internet the reserved set of IP address can be used [12]. IP addresses of class B were used in the range of 172.16.0.1 and onwards. We had 172.16.0.1 as the master node and the following 172.16.0.XX as the slave nodes. It should be also noted that the IP configuration was static. The host names were given as, master node with node00 and the slaves as node01 and onwards. The network, thus setup was to support a standalone cluster of PCs. No special topology was designed for the purpose. All the nodes were connected through the switch.

E. Configuration in Linux

(4)

96

Next we were required to alter certain Linux files for our purpose. If some of them were not available they were created. We installed the cluster on root user only. So we created a file named .rhosts in the root directory on all the nodes. This file was replicated on all the nodes. The file was something like this –

node00 root node01 root node02 root

This allowed the users to connect to the said host names through remote shell. And it reads that each node will use its root user for clusters. If that needs to be done for a particular user on a particular node, then that user name should be mentioned in rhosts file.

Next was required to create a file named hosts in the /etc directory. Through this hosts file we will allow the communication among all the nodes with the help of IP configuration of the node that we did before hand.The file for node00 was as under –

172.16.0.1 node00.home.net node00 127.0.0.1 localhost

172.16.0.2 node01 172.16.0.3 node02

This file for node01 will be as under –

172.16.0.2 node01.home.net node01 127.0.0.1 localhost

172.16.0.1 node00 172.16.0.3 node02

Similar steps were followed for the rest of the node.

Next was required to allow each node to access every other node, so the permissions were to be set. They were set by changing the hosts.allow file as

ALL+

There was this single line in the file. This was a loosely set parameter that we set and allow almost everything without any of the security constraints. This may pose a security threat if used for some highly secretive purpose or for sensitive data.

Next we configured the remote shell. We faced the problem that rsh was not installed on our OS. That was done through running the yum command and installing the missing package. The parent package for rsh was xinet, and then the child package was rsh-server. Both were installed. It was found that some OS already had rsh installed, if so then this step could be saved on those OS.

Next we were required to add the security permissions to the root user for rsh and related components. Added to the /etc/securetty file-

rsh, rlogin, exec, pts/0, pts/1

Next we modified the /etc/pam.d/rsh file –

#%PAM-1.0

# For root login to succeed here with pam_securetty, "rsh" must be

# listed in /etc/securetty.

auth sufficient

/lib/security/pam_nologin.so

auth optional

/lib/security/pam_securetty.so

auth sufficient

/lib/security/pam_env.so

auth sufficient

/lib/security/pam_rhosts_auth.so

account sufficient

/lib/security/pam_stack.so service=system-auth

session sufficient

/lib/security/pam_stack.so service=system-auth

Next it was required to enable the rsh, rlogin, telnet and rexec, they are disabled by default in RHEL. That was done by modifying each file with the name bearing the name of the services that we needed in /etc/xinetd.d, where the

(5)

97

socket_type = stream wait = no user = root

log_on_success += USERID log_on_failure += USERID

server = /usr/sbin/in.rshd disable = no

}

Then we restarted the xinetd service, by xinted –restart command on the bash shell.

F. Installing MPICH2

The MPICH2 package was downloaded from the site http://mcs.anl.gov/mpi/mpich/download.html. This was installed on the master node only. MPICH2 was installed with the steps as under –

We had this file named mpich2-1.4.1.tar.gz. We untar this file in the folder of the name mpich2-1.4.1. Then MPICH was installed using the terminal. Migrating to the MPI directory that was just created, the command was fired in bash shell –

./configure

This command sets up the check for the dependency that must be checked before installation of any package. After the configuration completes the command –

make

was issued to setup the environment for the MPICH installation. All ran fine and finally MPICH2 was installed with a command

make install

This took a few minutes to install.

G. Running programs

Once installed, we started executing the sample programs from the examples directory of the mpich2-1.4.1 directory. These programs were executed through the command, also mentioned in the MPICH2 documentation –

mpiexec -n <number> ./examples/cpi

where <number> is the number of processes you want to submit to the master for execution. It can be any numeral digit.

Also we controlled the number of processes that were to be submitted to each node. That was done by creating a machinefile in the mpich2-1.4.1 directory. The machinefile took the form as under –

node00 : 2 node01 : 4 node02 : 6

The numbers that followed “:” indicate the number of processes that were allowed to run on each node, indicated by a host name. These are the numbers that control the processes submitted to each node. We found that default was set to 1. Each node cannot run more processes than indicated in machinefile.

The command used to run the program in this way is –

mpiexec -f machinefile -n <number> ./examples/cpi

IV. OUTPUT

With the command that we fired as :

mpiexec -f machinefile -n 16 ./examples/cpi

and machinefile content as under -

node00 : 2 node01 : 4 node02 : 6

we had output something like this –

(6)

98

Process 9 of 16 is on node02 Process 10 of 16 is on node02 Process 11 of 16 is on node02 Process 12 of 16 is on node02 Process 13 of 16 is on node00 Process 14 of 16 is on node00 Process 15 of 16 is on node01 Process 16 of 16 is on node01 pi is approximately

3.1416009869231249, Error is 0.0000083333333318

Next we tested the same for another command with even more number of processes and the machinefile edited with drastic difference in number of processes allowed in a node -

mpiexec -f machinefile -n 32 ./examples/cpi

and machinefile content as under -

node00 : 1 node01 : 12 node02 : 5

we had output something like this –

Process 1 of 16 is on node00 Process 2 of 16 is on node01 Process 3 of 16 is on node01 Process 4 of 16 is on node01 Process 5 of 16 is on node01 Process 6 of 16 is on node01 Process 7 of 16 is on node01 Process 8 of 16 is on node01 Process 9 of 16 is on node01 Process 10 of 16 is on node01 Process 11 of 16 is on node01 Process 12 of 16 is on node01 Process 13 of 16 is on node01 Process 14 of 16 is on node02 Process 15 of 16 is on node02 Process 16 of 16 is on node02 Process 17 of 16 is on node02 Process 18 of 16 is on node02 Process 19 of 16 is on node00 Process 20 of 16 is on node01

Process 21 of 16 is on node01 Process 22 of 16 is on node01 Process 23 of 16 is on node01 Process 24 of 16 is on node01 Process 25 of 16 is on node01 Process 26 of 16 is on node01 Process 27 of 16 is on node01 Process 28 of 16 is on node01 Process 29 of 16 is on node01 Process 30 of 16 is on node01 Process 31 of 16 is on node01 Process 32 of 16 is on node02 pi is approximately

3.1415926535902168, Error is 0.0000000000004237

Thus it was seen from the two outputs described as above that the processes get distributed as per the policy that are decided by the machinefile. It was further also noticed that if a process, by the policy of machinefile, should go to a particular node, but that node being busy with some other processes and cannot accept the processes that are sent to it from the master node, the process was then transferred to the subsequent node that can accept the process and has not reached the maximum number of processes by policy. Such rigorous tests were done for the said example program and the results were not at all dissatisfying. The parallel execution of the processes, as seen from the output trace was blatant.

And hence it was seen that the nodes were interconnected and processes migrated amongst the nodes. Any application that uses the MPI calls can further be developed on this platform. The program can be in C/C++ of FORTRAN.

V. CONCLUSION

(7)

99

firewall. This platform is scalable enough to deploy clustering applications, the authors focus on such application in the interest of data mining. The authors appreciate hearing from anyone who can further help accomplish the scopes said prior.

ACKNOWLEDGMENT

The authors would like to acknowledge and appreciate the infrastructure and the resources that were provided by Charotar Institute of Technology, throughout the course of research and study.

REFERENCES

[1] Rajkumar Buyya, "High Performance Cluster Computing", Vol 1, Pearson Education, 1999.

[2] Amit Jain, "Beowulf Cluster Design and Setup", Boise State University, April 2006

[3] http://www.beowulf.org

[4] http://www.beowulf/underground.org

[5] Dr. Deven Shah, "Advanced Computing Technology",Dreamtech Press, pp.2, 2011.

[6] T. Sterling, J. Salmon, D. Becker and D. Savarese, "How To Build a Beowulf", MIT Press, 1999

[7] http://fscked.org/writings/clusters/cluster-1.html [8] ttps://computing.llnl.gov/tutorials/mpi/ [9] http://www.anl.gov/research/projects/mpich2/ [10] MPICH2 User's Guide

[11] http://linuxjournal.com/article/5690

[12] T. sterling, “Beowulf Cluster Computing with Linux”, MIT Press, October 2001

[13] R.J. Allan, S. J. Andrews and M.F. Guest, “High Performance Computing and Beowulf Clusters”, 6th European SGI/Cray MPP Workshop, Manchester, 7-8/9/2000.

AUTHORS

Tadrash Shah obtained his

bachelor’s degree, B.E. in Computer Engineering from Gujarat Technological University. He stood first in his college in Degree Engineering. He has published his paper on “A study of ensemble applications of Occam’s Razor” recently. He is interested in the research in the subjects like Algorithms, High-performance computing and Databases. He has been an intern at IIT-Gandhinagar and worked on the project titled” Virtual Geotechnical Laboratory”, funded by MHRD. He currently works with IIT-Bombay for Spoken Tutorials project as a Promoter and Tutorials

Developer for FOSS. He is pursuing his apprenticeship at Indian Institute of Management, Ahmedabad.

E-mail : [email protected]

Neel Patel completed his

under-graduation in B.E. in Computer Engineering from Gujarat Technological University. He has co-authored the paper on “A study of ensemble applications of Occam’s Razor” recently. He is interested in the areas of Object Oriented Programming with Java, Database, Algorithms, High-performance computing. He has been a treasurer of the IEEE Student Branch at Charotar Institute of Technology, Changa and currently a student member of the same. He has also worked with Times of India under the movement called Teach India. He has worked with various companies towards various projects and is pursuing his apprenticeship at Indian Institute of Management, Ahmedabad.

E-mail : [email protected]

Nishidh Chavda is an Assistant

Professor at Charotar University of Science and technology (CHARUSAT). He has guided the other two authors in this paper. He completed his Masters Degee, M.E. in Computer Engineering from Dharamsinh Desai University, Nadiad. His research interests include Data Mining, Computer Networks, Computer Organization, Data Structures, Compiler Construction and High-performance computing.