
4.3 Basic Components

4.3.2 Local Tuple Space

4.3.2.2 Local Space Synchronisation

In the context of the local tuple storage data structure described above, we should also explain how synchronisation is handled locally, within the partition of the tuple space maintained by each individual node.

One of the core functions used during both tuple storage and retrieval is String generateKey(Tuple t), which generates a key for the hash table used for tuple storage. The full type of the structure is java.util.Hashtable<String, Vector<Tuple>>, whereby each key is associated with a vector of tuples, as explained in the previous section. The key generated by the function is based on the first three fields of array tuples, and only the first field of all others (typically application-specific meta-tuples or those used in system control).
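The keying rule can be illustrated with a minimal sketch. It assumes that a tuple is an ordered list of fields and that array tuples can be recognised by a flag; the SketchTuple class, its arrayTuple flag and the ':' separator are hypothetical stand-ins chosen for illustration, not the system's actual types or key format.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the system's Tuple type (an assumption for illustration).
class SketchTuple {
    final boolean arrayTuple;   // assumed way of telling array tuples apart
    final List<Object> fields;

    SketchTuple(boolean arrayTuple, Object... fields) {
        this.arrayTuple = arrayTuple;
        this.fields = Arrays.asList(fields);
    }
}

class KeySketch {
    // Array tuples are keyed on their first three fields; all others on the first field only.
    static String generateKey(SketchTuple t) {
        int n = Math.min(t.arrayTuple ? 3 : 1, t.fields.size());
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) key.append(':');   // separator is an arbitrary choice
            key.append(t.fields.get(i));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(generateKey(new SketchTuple(true, "grid", 2, 3, 0.5)));
        System.out.println(generateKey(new SketchTuple(false, "control", "shutdown")));
    }
}
```

Under these assumptions, two array tuples that agree on their first three fields but carry different remaining fields map to the same key, and therefore to the same vector.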

When threads access this data structure, whether to retrieve or to store a tuple, the accesses must be synchronised in such a way as to allow as high a level of concurrency as possible, without allowing deadlock or data inconsistencies to occur. Consider the following code snippet, which shows the steps involved in performing an out() operation:

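    import java.util.Hashtable;
    import java.util.Vector;

    public class TupleSpaceImpl implements TupleSpace {

        private Hashtable<String, Vector<Tuple>> tuples;

        // other variable declarations, constructors omitted.

        public void out(Tuple t) {
            // look up the vector ("bucket") for this tuple's key
            Vector<Tuple> vals = tuples.get(generateKey(t));
            if (vals == null) {
                // no bucket yet: create one and register it in the table
                vals = new Vector<Tuple>();
                tuples.put(generateKey(t), vals);
            }
            // add the tuple under the bucket's lock and wake waiting readers
            synchronized (vals) {
                vals.add(0, t);
                vals.notifyAll();
            }
            // wake threads waiting on the table for a bucket to appear
            synchronized (tuples) {
                tuples.notifyAll();
            }
        }

        // other method definitions...
    }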

1. Attempt to retrieve the vector of tuples associated with the given tuple’s key.

2. If there is no existing associated vector, then we create one and place it in the hash table under that key.

3. The accessor thread will now obtain the object lock for the associated vector, add the tuple, and finally notify all other waiting threads before releasing the lock.

4. Finally, we notify all threads which may be waiting on the tuples hash table. This is required for threads which may be blocking during an in() or rd() operation.
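As listed, steps 1 and 2 are two separate calls on the hash table, so two writers arriving at the same moment for a key with no vector could each create one, and a tuple added to the vector that is then overwritten would no longer be reachable. One possible way to close that window, offered here as an assumption rather than as the system's actual code, is to perform lookup and creation as a single step with Hashtable.computeIfAbsent (available from Java 8). In this sketch strings stand in for tuples and the key is passed in directly; the class and method names are hypothetical.

```java
import java.util.Hashtable;
import java.util.Vector;

class AtomicBucketSketch {
    final Hashtable<String, Vector<String>> tuples = new Hashtable<>();

    void out(String key, String tuple) {
        // Lookup and creation happen as one synchronized step on the table.
        Vector<String> vals = tuples.computeIfAbsent(key, k -> new Vector<>());
        synchronized (vals) {
            vals.add(0, tuple);
            vals.notifyAll();
        }
        synchronized (tuples) {
            tuples.notifyAll();
        }
    }

    // Runs `writers` threads that all write to the same, initially absent, key
    // and returns how many tuples ended up in that key's vector.
    static int demo(int writers) {
        AtomicBucketSketch space = new AtomicBucketSketch();
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            String tuple = "t" + i;
            threads[i] = new Thread(() -> space.out("grid:0:0", tuple));
            threads[i].start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return space.tuples.get("grid:0:0").size();
    }

    public static void main(String[] args) {
        System.out.println(demo(8));
    }
}
```

With creation made atomic, every writer obtains the same vector for a given key, so no tuple is stranded in an unreachable vector.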

The synchronisation involved in the out() method is, for the most part, required so that other threads which have blocked while performing a blocking retrieval operation may be woken. It may be the case that the tuple being written to the space provides a match for a tuple being requested, and so all blocking threads must be notified when the out() operation completes.
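The hand-off between a writer and a blocked reader can be seen in isolation in the following sketch. It is a simplification under stated assumptions: strings stand in for tuples, a single vector stands in for one bucket, and the out(), in() and demo() helpers are hypothetical names rather than the system's methods.

```java
import java.util.Vector;

class WakeUpSketch {
    // Writer side: add under the bucket's lock, then wake any blocked readers.
    static void out(Vector<String> bucket, String tuple) {
        synchronized (bucket) {
            bucket.add(0, tuple);
            bucket.notifyAll();
        }
    }

    // Reader side: block on the bucket until something is available, then remove it.
    static String in(Vector<String> bucket) throws InterruptedException {
        synchronized (bucket) {
            while (bucket.isEmpty()) {
                bucket.wait();   // releases the bucket's lock while waiting
            }
            return bucket.remove(0);
        }
    }

    // Starts a reader that may block, then writes; returns what the reader received.
    static String demo() {
        Vector<String> bucket = new Vector<>();
        String[] received = new String[1];
        Thread reader = new Thread(() -> {
            try {
                received[0] = in(bucket);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        out(bucket, "result");
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the reader re-checks the bucket in a loop while holding its lock, the outcome is the same whether the write happens before the reader starts waiting or after.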

The outAll() operation is for all intents and purposes identical to out(); however, it performs a notifyAll() for each vector “bucket” into which a tuple is placed.

Next we cover the tuple retrieval operations, all of which are generalised into two methods: findTuple() (for retrieving a single tuple), and findAllTuples() (for bulk retrieval).

Firstly, the relevant code segment for findTuple() is listed below:

    private Tuple findTuple(TupleTemplate t,
                            boolean remove,
                            boolean block)
    {
        Vector<Tuple> vals = tuples.get(generateKey(t));
        while (vals == null) {
            if (block) {
                synchronized (tuples) {
                    try {
                        tuples.wait();
                    } catch (InterruptedException e) {
                        return null;
                    }
                }
                // woken by a producer: try again to obtain the bucket
                vals = tuples.get(generateKey(t));
            } else {
                return null;
            }
        }
        synchronized (vals) {
            Tuple result = null;
            for (;;) {
                // search the bucket for the first matching tuple
                for (Tuple i : vals) {
                    if (t.matches(i)) {
                        result = i;
                        if (remove)
                            vals.remove(result);
                        return result;
                    }
                }
                // no match: wait for a producer (or the timeout), then search again
                if (block) {
                    try { vals.wait(1000); }
                    catch (InterruptedException e) {
                        return null;
                    }
                } else {
                    return result;
                }
            }
        }
    }

The semantics of this method are such that it will return the first tuple found which matches the given template; if the remove parameter is true, then the matching tuple will also be removed from the space, otherwise only a copy will be returned. If a matching tuple is not found, then what happens depends on the block parameter; if true, then the accessor thread will block until a matching tuple becomes available, otherwise, a null value will be returned. If a blocking thread is interrupted at any time, a null value will also be returned.

Synchronisation constructs are used in two instances in this method. Firstly, in the case where there is no associated vector for the given template and blocking behaviour has been specified, the accessing thread will synchronise on the tuples object and block until notified by another thread performing a tuple production operation. At this point it will attempt again to obtain a reference to the associated vector.

The second use of synchronisation in this method occurs when a reference to an associated “bucket” vector has been obtained, and must be searched for a matching tuple. All searches are synchronised on the vector itself. Once again, if no matching tuple is found and blocking has been specified, the thread waits on the vector object until it is notified by another thread or until the one-second timeout passed to wait() elapses, and then repeats the search.

The final operations which should be mentioned are those used for bulk tuple retrieval. Both the inAll() and rdAll() methods utilise the findAllTuples() method, which searches potentially the entire local partition of the space and returns between zero and a specified number of matching tuples (the limit being given by the expected method parameter).
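The source does not list findAllTuples(), so the following is only a sketch of how such a bounded, non-blocking search might look. It assumes strings in place of tuples, a Predicate in place of a template, and a hypothetical signature with expected and remove parameters; the out() helper exists only to populate the sketch.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;
import java.util.function.Predicate;

class BulkSketch {
    final Hashtable<String, Vector<String>> tuples = new Hashtable<>();

    // Helper used only to populate the sketch.
    void out(String key, String tuple) {
        Vector<String> vals = tuples.computeIfAbsent(key, k -> new Vector<>());
        synchronized (vals) {
            vals.add(0, tuple);
        }
    }

    // Returns between zero and `expected` matching tuples from the local partition.
    List<String> findAllTuples(Predicate<String> template, int expected, boolean remove) {
        List<String> found = new ArrayList<>();
        // Snapshot the buckets so the table lock is not held during the search.
        List<Vector<String>> buckets;
        synchronized (tuples) {
            buckets = new ArrayList<>(tuples.values());
        }
        for (Vector<String> bucket : buckets) {
            synchronized (bucket) {   // only one bucket is locked at a time
                Iterator<String> it = bucket.iterator();
                while (it.hasNext() && found.size() < expected) {
                    String candidate = it.next();
                    if (template.test(candidate)) {
                        found.add(candidate);
                        if (remove) it.remove();
                    }
                }
            }
            if (found.size() >= expected) break;
        }
        return found;
    }
}
```

Locking one bucket at a time keeps the search consistent with the per-bucket serialisation used by the single-tuple operations, at the cost of not seeing a single atomic snapshot of the whole partition.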

In summary, synchronisation constructs are used sparingly within each partition of the tuple space, in keeping with our aim of achieving the maximum level of concurrent access possible: only access to an individual vector “bucket” is serialised, while different “buckets” may be accessed concurrently by different threads and/or remote nodes. In practice, and in the context of the applications being presented, this means that threads reading from or writing to the same array element will serialise their accesses, while all other array elements may be accessed concurrently. The other main use of synchronisation is the notification of blocked threads.
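The independence of buckets can be checked with a small experiment, sketched here with two plain vectors standing in for two buckets (the class and variable names are hypothetical). One thread holds the lock on the first vector while the calling thread writes to the second.

```java
import java.util.Vector;
import java.util.concurrent.CountDownLatch;

class BucketIndependenceSketch {
    // Returns true if a write to bucketB completes while another thread
    // is still holding the lock on bucketA.
    static boolean demo() {
        Vector<String> bucketA = new Vector<>();
        Vector<String> bucketB = new Vector<>();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (bucketA) {
                lockHeld.countDown();
                try {
                    release.await();   // keep bucketA locked until told to release
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        holder.start();

        boolean wroteWhileLocked = false;
        try {
            lockHeld.await();
            synchronized (bucketB) {   // not blocked by the lock on bucketA
                bucketB.add(0, "tuple");
            }
            wroteWhileLocked = holder.isAlive() && bucketB.size() == 1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();       // always let the holder finish
        }
        try {
            holder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return wroteWhileLocked;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Had both threads targeted the same vector, the write would instead have had to wait for the holder to release its lock, which is the serialisation described above.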

A final important consideration is how these synchronisation mechanisms affect the application programmer. It is desirable for the synchronisation described above to be as transparent as possible, so that the distribution of the space and the parallelism of the application being implemented is able to be expressed implicitly rather than have to be explicitly specified in the application code. However, as we can see from the description of the local tuple space synchronisation constructs above, access to each individual array element or tuple in the space is strictly serialised. Whilst we wish to avoid serialised access wherever possible, it is important to note that, for the applications for which this system is intended, quite often each array element will also incorporate a timestep value. As such, very rarely will it be the case that an element will need to be modified; the vast majority of the time these elements will be treated as read-only, with subsequently produced new values being tagged with an incremented timestep value. Therefore, we believe the impact on the application programmer to be minimal, in keeping with the goals of this research.
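The timestep pattern can be sketched as follows. The Element class and its four fields (array name, index, timestep, value) are assumptions made for illustration, as is the advance() helper; the point is only that a new value is written as a new tuple while the earlier one is left unmodified.

```java
import java.util.Vector;

class TimestepSketch {
    // Hypothetical element tuple: array name, index, timestep and value (assumed fields).
    static final class Element {
        final String array;
        final int index;
        final int timestep;
        final double value;

        Element(String array, int index, int timestep, double value) {
            this.array = array;
            this.index = index;
            this.timestep = timestep;
            this.value = value;
        }
    }

    // Produces the next value as a new tuple; the previous tuple is left untouched.
    static Element advance(Vector<Element> bucket, Element previous, double newValue) {
        Element next = new Element(previous.array, previous.index,
                                   previous.timestep + 1, newValue);
        synchronized (bucket) {
            bucket.add(0, next);
            bucket.notifyAll();
        }
        return next;
    }
}
```

Since existing elements are never updated in place, readers of an earlier timestep contend with the writer only for the brief insertion into the shared vector.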

In document Tupleware: a distributed tuple space for the development and execution of array based applications in a cluster computing environment (Page 64-68)