• No results found

P M W C omm Syn c RD a d d r e ss Syn c d o ma in ta b le L C o m m L C o m m L C o m m hos t table dom ain table

dom ain table proc es s table

dom ain table proc es s table P M W C omm L C o m m L C o m m L C o m m Syn c RD , lo ca l P MWCo mm a d d r e ss Syn c d o ma in ta b le , p r o ce ss ta b le

Figure 5.10: Information is synchronized hierarchically.

Table 5.1: A domain table

Domain ID LComms Implementation Cluster frontend (PWMComm contact)

1 Unix socket 128.16.6.250

2 PVM 128.21.3.197

3 MPI 128.1.8.12

5.3.5.2

Domain Table

A domain table lists the domain ID and the contact details of related domain. Contact details include the LComm implementation approach of the cluster/domain, and the IP addresses of the cluster frontend where the PMWComm is placed. For example, Table 5.1 lists the information of three clusters, where LComms in cluster 1 is implemented by Unix socket, LComms in cluster 2 is implemented based on PVM, and LComms in cluster 2 is implemented based on MPI. If any process in cluster 2 wants to communicate with processes in cluster 3, it first detects that the destination process is from cluster 3 by extracting its domain ID from its UPID. Then it maps the domain ID to the related PMWComm contact details. It finally contacts the related PMWComm and asks it to forward the message to the destination process via the related protocol (MPI) in cluster 3.

5.3.5.3

Host Table

The host table lists the information of all the processes and related host in the virtual machine, including host name and the related probe job ID. It is used to release the host when killing a process. When a process is killed, the RD will look up the host table to find out the related host and job information. It releases the host by killing the probe job. e.g. qdel 4019372. It also provides users a global view of the virtual machine when users use the AA console. This host table only stays in the NameServer and is not synchronized to other daemons. An example of the host table is shown in Table 5.2.

5.4

Implementation

In this section we introduce how we implemented AA based on the architecture introduced.

5.4. Implementation 63

Table 5.2: A host table

UPID Domain ID DRM Job ID Host name Host attributes

4259842 1 Condor 12788.0 calvini.cs.ucl.ac.uk LINUX, INTEL 32

4259843 1 Condor 12789.0 eco.cs.ucl.ac.uk LINUX, INTEL 32

4325377 2 SGE 4019372 chico-13-17 LINUX, INTEL 64

4325388 2 SGE 4019373 chico-13-18 LINUX, INTEL 64

PMW Co mm

LComm LComm LComm

PVM PVM

RD

Domain 1 PVM intstalled

PMW Co mm

LComm LComm LComm

LAM/MPI LAMMPI Domain 2 LAM/MPI installed

PMW Co mm

LComm LComm LComm

MPICH MPICH Domain 3 MPICH installed

PVM LAM/MPI MPICH

XML RPC XML RPC

XML-RPC

Figure 5.11: The implementation of the AA.

various distributed computing environments configured with different software (Figure 5.11). This is achieved by implementing common interface to various existing software that provide interprocess com- munication and process management for the LComm daemon. The common communication interface provides flexible access to established communication methods such as Unix socket, PVM and MPI. The common process management interface abstracts a number of function modules to leverage existed methods such as SSH, PVM and LAM/MPI [BDV94]. To port the AA to different environments, the AA programmers only need to re-implement the LComm interface. The current implementation of LComm (Section 5.4.1) is based on PVM.

The PMWComm daemon is implemented as a XML-RPC [Win99, SSL01] server. XML-RPC is a Remote Procedure Calling protocol that uses XML to encode its calls and HTTP as a transport mech- anism. We chose XML-RPC instead of other protocols (e.g. CORBA [cor], SOAP [soa]) because it is easy to use and its loose-coupling feature makes it possible to connect a large number of geographically distributed clusters. Also the based HTTP protocol is firewall-friendly and enables the AA to deploy proxies to any parts of the Internet without specific configurations. A PMWComm daemon talks to its external daemons through XML-RPC, while talking to its internal LComms via the LComm communica- tion protocol which is based on the chosen library (Figure 5.11). The remote invocation of PMWComm is performed through SSH.

The RD daemon is also a XML-RPC server. It communicates with other daemons via the XML- RPC protocol. It is spawned via the fork-exec call on the local machine where users invokes an appli- cation. The RD contacts cluster DRMs through a special program RAS (Resource Acquisition Server) which is pre-placed on the submission machines of clusters. A RAS is another XML-RPC server which

5.4. Implementation 64 RD RAS Subm it m ac hine D RM dae m o n H o st pro be 1.reques t via xm l-rpc 2. s ubm it via s ys tem c all

3. alloc ate a hos t 4. grant the res ourc e via xm l-rpc

Figure 5.12: The process of obtaining a resource.

serves the host requests from all AA-enabled applications. Currently it uses the system call to submit the probe job for resource allocation. When it receives a request from a RD, it generates a related submission file, and uses the system() call to execute the job submission command that will finally submit the job. For example, in a Condor cluster it generates a probe.submit file and submit it by system(“condor submit probe.submit”). Once a host has been allocated, the probe gains the host information (host name, op- erating system, CPU architecture, and CPU load) by reading Linux /proc/ directory and reports the information back to the requested AA. Figure 5.12 depicts the process of obtaining a resource from a DRM.

The AA is currently implemented in C++. Communication security is provided by the standard UNIX environment including SSH and RSH. File transfers are performed by SCP and RCP. The AA takes advantage of the GNU LGPL software XmlRpc++ [xml] to implement the XML-RPC mechanism and the AFPL software XMLparser [Ber] to parse the XML file.

5.4.1

PVM Implementation

PVM is a software system that enables a collection of heterogeneous computers to be used as a coherent and flexible concurrent computational resource. It has a flexible process management system and a communication system which the AA can make full use of.

After application invocation, the AA library starts a LComm (actually a pvmd) by pvm start pvmd(). If a PVM has already been started on that machine, the LComm will attach itself into the system. To add a process in the local cluster, the LComm first adds a granted host into the virtual machine by placing a new LComm on that host via pvm addhost() and pvm spawn(LComm), and asks the new LComm to start the process. Since PVM is bounded to join resources that are located in the same network domain, to start a process on an external domain host, the AA places a PMWComm daemon on the frontend of the remote domain. After remote invocation, the PMWComm starts another PVM in the remote cluster, and then adds the granted host to the PVM and starts the process by pvm spawn(). After successful spawning, processes register themselves to the PMWComm which will then place (UPID - PVM tid) pairs into the local process table.

The communication between LComms is based on PVM’s communication mechanism. To send a message to a host in the same cluster, the message is packed into a PVM message buffer (pvm pk*()) and routed through pvmd by PVM send routine (pvm send(tid), where tid is interpreted from its UPID).

5.5. Evaluation 65