Helloworld using the SMG API

In the distributed shared memory programming model, the level at which the user is responsible for consistency dictates the semantics and style of the user API (currently only a C interface is provided). The API consists of a number of functions that allow the user to exploit the facilities provided by the system, such as locks and barriers. One of the most successful implementations of software-only DSM is Treadmarks, whose API is simple and elegant (listed in Appendix C, page 221). The design of the SMG API follows a similar style, and is given in full in Appendix E (page 243). In the Treadmarks model a thread is responsible for ensuring that it has correct access privileges to a variable through the use of synchronisation primitives. This principle applies even if a thread only requires read access. The SMG DSM introduces the same constraints: for a thread to ensure that it is using a consistent view of shared memory, it must perform the appropriate synchronising operation. A developer constructs an application by specifying the control of each thread using flow control statements that refer to thread identifiers.

All function calls in the SMG API return an error code indicating the status of the call. Typical values are SMG_SUCCESS or SMG_FAILURE. These return codes should be examined to ensure that the application is executing correctly and that the requested action has taken place. Input and output parameters are specified as the arguments to the function.
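The return-code convention can be illustrated with a small checking helper. This is a minimal sketch only: the numeric values of SMG_SUCCESS and SMG_FAILURE are not given in the text and are assumed here, and smg_check, fake_smg_call and run_steps are hypothetical names used for illustration, not part of the SMG API.

```c
#include <stdio.h>

/* Assumed values: the text names these codes but does not define them. */
enum { SMG_SUCCESS = 0, SMG_FAILURE = 1 };

/* Hypothetical helper: report a failed call and pass the code through. */
static int smg_check(int rc, const char *what)
{
    if (rc != SMG_SUCCESS)
        fprintf(stderr, "%s failed with code %d\n", what, rc);
    return rc;
}

/* Stand-in for any SMG call; a real program would call the SMG library. */
static int fake_smg_call(int should_fail)
{
    return should_fail ? SMG_FAILURE : SMG_SUCCESS;
}

/* Runs two calls in sequence and stops at the first failure. */
int run_steps(int fail_second)
{
    int rc = smg_check(fake_smg_call(0), "first step");
    if (rc != SMG_SUCCESS)
        return rc;
    return smg_check(fake_smg_call(fail_second), "second step");
}
```

Checking every call in this way lets an application stop at the first failed step instead of continuing with an inconsistent view of shared memory.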

6.4.1 DSM Initialise

The local actions essentially consist of the initialisation of the internal storage, DSM handles and queues. This SMG routine must be invoked before any other API call. The API initialisation function is given below. The first two arguments, argc and argv, are the arguments passed to the user application at run-time, which are in turn passed to the MPI initialisation routine. The flags argument specifies the environmental requirements, such as information & monitoring services. The last argument, type_defaults, defines the requirements of the application developer and specifies the default memory consistency models and coherence protocols; this allows internal DSM engine optimisations.

int SMG_init(int *argc, char ***argv, int flags, int type_defaults);

At the start-up sequence of the SMG DSM engine a number of tasks are performed:

• initialisation of the local internal DSM structures

• the start-up of the DSM system handler thread(s)

• the establishment of the underlying communication channels between all processes. In the MPI implementation, this acts as a wrapper around the underlying initialisation call (i.e. MPI_Init or MPI_Init_thread in the single- and multi-threaded versions respectively).

• the installation of the shared memory write trapping mechanism; currently this is a system page-fault handler, which is described in detail in Section 7.3.1.

• If the information and/or monitoring system is required, then the application must register with it during the initialisation routine. If registration fails because the information and monitoring system is not available then the system will exit.

• The initialisation routine also acts as a global barrier, ensuring that all threads of execution, wherever they are, have performed these tasks before DSM requests can be invoked by remote processes.

The error code returned will indicate successful initialisation or failure. Sources of failure can include the unavailability of requested information & monitoring services, and failure by the underlying communication system. Upon successful completion the value SMG_SUCCESS is returned. At this point the global variables, such as SMG_proc_rank and SMG_proc_size, are valid, and, if specified, the application has been registered with the information system.
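The calling pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the SMG implementation: the SMG_init body is a single-process stand-in (the real call wraps MPI initialisation and acts as a global barrier), the numeric return codes are assumed, the two zero arguments are placeholders because the text does not list valid flags or type_defaults values, and app_start is a hypothetical name.

```c
#include <stdio.h>

enum { SMG_SUCCESS = 0, SMG_FAILURE = 1 };  /* assumed values */

int SMG_proc_rank = -1;  /* valid only after a successful SMG_init */
int SMG_proc_size = -1;

/* Single-process stand-in for the real call. Rejecting NULL pointers is
 * this sketch's own choice of failure case, not documented behaviour. */
int SMG_init(int *argc, char ***argv, int flags, int type_defaults)
{
    (void)flags;
    (void)type_defaults;
    if (argc == NULL || argv == NULL)
        return SMG_FAILURE;
    SMG_proc_rank = 0;
    SMG_proc_size = 1;
    return SMG_SUCCESS;
}

/* Hypothetical application start-up: initialise first, then read globals. */
int app_start(int argc, char **argv)
{
    int rc = SMG_init(&argc, &argv, 0, 0);  /* 0, 0: placeholder values */
    if (rc != SMG_SUCCESS) {
        fprintf(stderr, "SMG_init failed (%d)\n", rc);
        return rc;
    }
    printf("process %d of %d initialised\n", SMG_proc_rank, SMG_proc_size);
    return SMG_SUCCESS;
}
```

In a complete program, main would call app_start(argc, argv) before touching any shared memory, since the globals are meaningful only after the call has returned successfully.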

6.4.2 DSM System Environment

Although the global variables that specify the process pool size and a process’s rank (SMG_proc_size and SMG_proc_rank respectively) are available to the developer once the initialisation call has returned, supplementary functions are provided that return these values dynamically when required. The SMG prototypes for these functions are:

int SMG_process_size();
int SMG_process_rank();

The first function will return the number of processes in the system, which will be less than or equal to the total number of user threads. Currently, this value is fixed at run-time due to the present lack of support for dynamic processes in MPI. The latter function will return the numerical rank of the calling process, in the range [0..N-1], where N is the total number of processes in the system.
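One use of the rank and size values is to divide work between processes. The sketch below assumes a simple block decomposition; struct range and block_range are hypothetical names, not part of the SMG API. In an SMG program the rank and nprocs arguments would be expected to come from SMG_process_rank() and SMG_process_size().

```c
/* Half-open range [begin, end) of items owned by one process. */
struct range {
    int begin;
    int end;
};

/* Splits nitems as evenly as possible over nprocs processes.
 * rank must lie in [0, nprocs-1], matching the range given in the text. */
struct range block_range(int rank, int nprocs, int nitems)
{
    int base = nitems / nprocs;   /* items every process receives */
    int extra = nitems % nprocs;  /* the first `extra` ranks get one more */
    struct range r;

    r.begin = rank * base + (rank < extra ? rank : extra);
    r.end = r.begin + base + (rank < extra ? 1 : 0);
    return r;
}
```

With 10 items and 4 processes, ranks 0 and 1 own three items each and ranks 2 and 3 own two each, so the ranges cover every item exactly once.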

A function to display internal DSM engine information (SMG_print_state) is provided. Additional functions are provided for getting and setting system attributes; for example, these functions provide user access to the topology information if it was originally enabled. An API call (SMG_module_load) is also provided for loading additional modules that extend the DSM, e.g. in areas such as shared memory management.

void SMG_print_state(int stream);

int SMG_internal_get(int key, void *value);
int SMG_internal_set(int key, int value);

int SMG_module_load(int MODULE_TYPE, char *location);

6.4.3 DSM Finalise

In order for the system to exit cleanly the finalisation function must be invoked. This ensures that all processes synchronise on exit, and guarantees that processes finish only when all of them are ready to do so, thereby preventing shared memory regions from being freed while they might still be required at a remote node. When all processes have reached this point all remaining shared memory regions are freed. Once all local cleanup routines have been called the information & monitoring systems can be signalled. Finally the underlying communication environment can be finalised. This function may act as a wrapper around the relevant communication function; in the MPI communications implementation, MPI_Finalize is called by the DSM system.

The SMG API finalisation call, called at the end of all SMG applications, is:

int SMG_finalise()

This function will block until it has completed successfully. Once this is done no further DSM services are available. This function should be called only once, usually by the master user thread; otherwise the behaviour is unspecified.
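Because a second finalisation call has unspecified behaviour, an application may wish to guard against it. The following is a minimal sketch of one such guard, assuming it is only ever called from the master thread (it is not thread-safe). The SMG_finalise body is a stand-in that merely counts calls, the return-code values are assumed, and app_shutdown and finalise_calls are hypothetical names.

```c
enum { SMG_SUCCESS = 0, SMG_FAILURE = 1 };  /* assumed values */

static int finalise_calls = 0;  /* how often the stand-in was invoked */

/* Stand-in for the real call, which blocks until all processes are ready
 * to exit and then finalises the communication layer. */
int SMG_finalise(void)
{
    finalise_calls++;
    return SMG_SUCCESS;
}

/* Hypothetical guard: forwards to SMG_finalise on the first call only. */
int app_shutdown(void)
{
    static int done = 0;

    if (done)
        return SMG_SUCCESS;  /* already finalised, nothing more to do */
    done = 1;
    return SMG_finalise();
}
```

Routing every exit path through a single guarded function keeps the "call once" rule in one place instead of relying on each caller to remember it.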

6.4.4 DSM Abort

In certain circumstances it is desirable for an application to abort during execution for some reason local to one of the processes. An abort call is provided to allow the application as a whole to degrade gracefully, and to perform housekeeping functions such as the closure of files, deregistering from the information system, and the reporting of errors to the developer. This call may form a wrapper around the underlying communication call (MPI_Abort), so once it has been called no further interprocess communication is possible. This call will terminate all processes involved in the DSM application. In all cases the user specifies the error code, error_code, to return to the invoking environment. The SMG API call to abort an application is:

int SMG_Abort(int error_code)
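A typical trigger for an abort is a local failure that makes further progress pointless, such as a missing input file. The sketch below illustrates that pattern under explicit assumptions: the SMG_Abort body is a stand-in that only records the code and returns, whereas the real call terminates every process; ERR_INPUT_MISSING, last_abort_code and open_input_or_abort are hypothetical names.

```c
#include <stdio.h>

static int last_abort_code = 0;  /* recorded by the stand-in below */

/* Stand-in: the real call terminates all processes and does not return
 * control to normal execution. Recording the code lets the sketch run. */
int SMG_Abort(int error_code)
{
    last_abort_code = error_code;
    return error_code;
}

enum { ERR_INPUT_MISSING = 2 };  /* hypothetical application error code */

/* Returns the open file, or NULL after requesting an abort. */
FILE *open_input_or_abort(const char *path)
{
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        fprintf(stderr, "cannot open %s\n", path);
        SMG_Abort(ERR_INPUT_MISSING);
    }
    return f;
}
```

The error code passed to the abort call is what the invoking environment sees, so distinct application-level codes make the cause of a failed run easier to identify.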

6.4.5 User Multi-threading

Enabling multi-threaded user applications can result in significant performance gains [132]. Such support leverages hardware advances, such as multi-core processors, to better employ multiple user threads per node, allowing the DSM to fully exploit the available resources (for example, by aggressively overlapping computation and communication).

When creating user threads it is important that they are registered with the DSM run-time management system. This allows a thread signal mask to be set in order to catch accesses to shared memory regions. Thread creation is requested using the SMG_thread_create API call, essentially a wrapper around the underlying thread library call. Only threads created using the SMG API call will be visible to the DSM system. If the API call is bypassed when a user thread is created and that thread accesses a shared region, an unintended SEGV fault will occur. More of the latent effects related to this decision are mentioned in Section 8.1.

At thread initialisation the appropriate cleanup handler functions are registered using the pthread_cleanup_push call, so the system routines that must be called before the user thread exits are registered automatically, sparing the developer from doing so manually in the application. An optional check verifies that the exiting thread does not hold any locks; if a lock is held, appropriate measures may be taken. Currently, the event is logged to the monitoring system.

As the SMG API function for the creation of user threads is a wrapper around the underlying pthread library call, it presents the same familiar argument list to the developer. The function parameter tid is a thread identifier for the newly created thread; attr are the required thread attributes; start_routine is the entrance function for the newly created thread; and arg is a reference to any input parameters that the user may wish to pass to the thread. The error codes returned by the function are the same as those returned by the underlying thread creation call.

int SMG_thread_create(pthread_t *tid, pthread_attr_t *attr, void *start_routine, void *arg);
int SMG_thread_count();
int SMG_thread_count_atrank(int process_id);
int SMG_thread_systemwide();
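The wrapper idea described in this section can be sketched with plain POSIX threads. This is not the SMG implementation: sketch_thread_create, sketch_thread_count and the other names are hypothetical, the start routine uses the standard pthread function-pointer type, and the signal-mask setup and lock checks mentioned in the text are omitted. The sketch only shows a trampoline that registers a cleanup handler with pthread_cleanup_push and keeps a count of live threads in the local process.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int live_threads = 0;  /* threads started via the wrapper, still running */

struct launch { void *(*fn)(void *); void *arg; };

static void adjust_count(int delta)
{
    pthread_mutex_lock(&count_lock);
    live_threads += delta;
    pthread_mutex_unlock(&count_lock);
}

/* Cleanup handler: runs on normal return, pthread_exit, or cancellation. */
static void on_thread_exit(void *unused)
{
    (void)unused;
    adjust_count(-1);
}

/* Runs in the new thread: registers the handler, then calls the user routine. */
static void *trampoline(void *p)
{
    struct launch l = *(struct launch *)p;
    void *result;

    free(p);
    pthread_cleanup_push(on_thread_exit, NULL);
    result = l.fn(l.arg);
    pthread_cleanup_pop(1);  /* 1: also run the handler on normal return */
    return result;
}

/* Hypothetical wrapper with the same argument order as pthread_create. */
int sketch_thread_create(pthread_t *tid, const pthread_attr_t *attr,
                         void *(*start_routine)(void *), void *arg)
{
    struct launch *l = malloc(sizeof *l);
    int rc;

    if (l == NULL)
        return EAGAIN;
    l->fn = start_routine;
    l->arg = arg;
    adjust_count(+1);  /* counted before the thread starts running */
    rc = pthread_create(tid, attr, trampoline, l);
    if (rc != 0) {     /* creation failed: undo the bookkeeping */
        adjust_count(-1);
        free(l);
    }
    return rc;         /* same error codes as pthread_create */
}

int sketch_thread_count(void)
{
    int n;

    pthread_mutex_lock(&count_lock);
    n = live_threads;
    pthread_mutex_unlock(&count_lock);
    return n;
}

/* Example thread body: increments the integer it is given. */
static void *add_one(void *arg)
{
    ++*(int *)arg;
    return arg;
}

/* Returns 0 if one thread ran and the live count dropped back to zero. */
int demo_one_thread(void)
{
    pthread_t tid;
    int value = 41;

    if (sketch_thread_create(&tid, NULL, add_one, &value) != 0)
        return -1;
    pthread_join(tid, NULL);
    return (value == 42 && sketch_thread_count() == 0) ? 0 : 1;
}
```

Because the handler is registered inside the trampoline, a thread created through the wrapper is unregistered however it ends, which is the property the text attributes to the automatic cleanup registration.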

The number of threads that are active in a given process can be obtained using the SMG_thread_count_atrank call. The number of currently 'alive' user application threads within the local process, created using the SMG_thread_create function, can be obtained using the SMG_thread_count function. The total number of system threads that are active across all processes at a point in time can be obtained using the SMG_thread_systemwide function. Changes in the number of user threads in the system only become visible upon a system-wide barrier, so between initialisation and the first global barrier this function will return a value equal to SMG_proc_size.
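The barrier-delayed visibility of the system-wide thread count can be pictured with a toy model. The sketch below is an illustration of the described semantics only, not of how the DSM engine stores or exchanges these counts; struct thread_view and the view_* functions are hypothetical names.

```c
/* Toy model: `live` changes immediately, `published` only at a barrier. */
struct thread_view {
    int live;       /* actual number of user threads in the system */
    int published;  /* value a system-wide query would report */
};

/* After initialisation there is one user thread per process. */
void view_init(struct thread_view *v, int nprocs)
{
    v->live = nprocs;
    v->published = nprocs;
}

/* A new user thread was created somewhere in the system. */
void view_thread_created(struct thread_view *v)
{
    v->live++;
}

/* A system-wide barrier makes pending changes visible. */
void view_barrier(struct thread_view *v)
{
    v->published = v->live;
}

/* Models the system-wide query: returns the last published value. */
int view_systemwide(const struct thread_view *v)
{
    return v->published;
}
```

In this model, a query made after two threads are created but before any barrier still reports the initial process count, and reports the increased total only once a barrier has been passed.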