
3.3 HALLOC: A Hierarchical Memory Allocator

3.3.2 Memory Allocation

As we already know, memory allocation in HALLOC happens at two levels. A wrapper function, scc_malloc, is called by a program whenever it needs to allocate storage. Internally, scc_malloc calls either the global allocator, scc_malloc_global, or the local allocator, scc_malloc_local. We start with the data structure of a chunk and then look at the allocators. We do not describe the local allocator scc_malloc_local in depth, as it is the “first fit” algorithm from [71].

Next | Chunk Size | Owner ID | Free Memory

Figure 3.6 A Chunk with Header

The information required to manage a super-chunk or chunk is kept in a data structure called the chunk header. The chunk header has three fields: the first is a pointer to the next chunk in the free list or garbage list; the second is the size of the chunk, i.e. the memory available for storage; and the third is the owner-id. We use the core-id as the owner-id for chunks. The chunk header is followed by the memory area that is available to the end user for storage. The pointer returned by the allocator points at the storage area, not at the chunk header itself. A chunk is depicted in Figure 3.6. Since we know exactly how big the header is, we can always retrieve the information we need from the header using pointer arithmetic. To keep track of the free storage available, the local allocator uses the next-chunk field of the header to build a circular linked list, called the free list, composed of all the free chunks. Once a chunk is allocated, it is removed from the free list and its next-chunk field is set to NULL.
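The header layout and the pointer arithmetic described above can be sketched in C as follows. This is only an illustration: the type name chunk_header, the field names, and the helpers chunk_to_user and user_to_chunk are hypothetical and are not taken from the HALLOC sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of the three-field chunk header. */
typedef struct chunk_header {
    struct chunk_header *next; /* next chunk in the free list or garbage list */
    size_t size;               /* bytes available for storage behind the header */
    int owner_id;              /* core-id of the owning core */
} chunk_header;

/* The storage area handed to the user starts right behind the header. */
static void *chunk_to_user(chunk_header *c) {
    return (void *)(c + 1);
}

/* Recover the header from a user pointer by stepping back one header. */
static chunk_header *user_to_chunk(void *p) {
    assert(p != NULL);
    return (chunk_header *)p - 1;
}
```

Because the header has a fixed size, a deallocation routine that receives only the user pointer could call user_to_chunk to read the size and owner-id fields without any additional lookup table.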

Algorithm 1 Algorithm scc_malloc_global to Allocate m Bytes from SHM-Arena

Require: m ≤ SHM-Arena_size ◃ size of un-allocated memory in SHM-Arena
1: tas_lock()
2: super-chunk ← SHM-Arena_ptr
3: super-chunk_next ← NULL
4: super-chunk_size ← m
5: super-chunk_owner-id ← core-id
6: SHM-Arena_ptr ← SHM-Arena_ptr + (m + header_size)
7: SHM-Arena_size ← SHM-Arena_size − (m + header_size)
8: tas_unlock()
9: Return super-chunk

Algorithm 1 describes the global allocator. It is a very simple allocator that keeps, as meta-data, the size of the SHM-Arena and the starting point of the un-allocated memory in the SHM-Arena. When a core requests memory, the global allocator performs three steps. First, it checks whether there is enough memory to allocate; if there is, it continues to the next step, otherwise it returns an error. Next, it acquires a lock to avoid any corruption of the meta-data. Finally, it allocates a super-chunk of the required size with the appropriate header information, and then adjusts the size and starting point of the un-allocated memory in the SHM-Arena before releasing the lock. The global allocator can be called by multiple cores simultaneously, so a lock is needed to guard the critical section. We use the Test-and-Set (T&S) register available on the SCC to implement this lock. Locks implemented using T&S provide atomicity at the SCC level, which is a mandatory property for the global allocator.
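A minimal C sketch of such a bump-style global allocator is given below. It rests on several assumptions that are not part of the original implementation: the C11 atomic_flag spin lock merely stands in for the SCC's T&S register, arena_init and malloc_global_sketch are hypothetical names, the size check is performed inside the critical section, and the payload is rounded up so that the next header stays aligned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct super_chunk {
    struct super_chunk *next; /* unused until the chunk enters a list */
    size_t size;              /* usable bytes behind the header */
    int owner_id;             /* core-id of the requesting core */
} super_chunk;

/* Meta-data of the arena: start and size of the un-allocated region. */
static unsigned char *arena_ptr = NULL;
static size_t arena_size = 0;

/* Assumption: a C11 atomic_flag replaces the SCC Test-and-Set register. */
static atomic_flag arena_lock = ATOMIC_FLAG_INIT;
static void tas_lock(void)   { while (atomic_flag_test_and_set(&arena_lock)) { /* spin */ } }
static void tas_unlock(void) { atomic_flag_clear(&arena_lock); }

/* Hypothetical helper: hand a suitably aligned memory region to the allocator. */
static void arena_init(void *base, size_t bytes) {
    arena_ptr = base;
    arena_size = bytes;
}

/* Carve a super-chunk of at least m usable bytes off the front of the arena. */
static super_chunk *malloc_global_sketch(size_t m, int core_id) {
    const size_t align = _Alignof(super_chunk);
    super_chunk *sc = NULL;

    tas_lock();
    if (m <= arena_size) {
        /* Round the payload up so the following header remains aligned. */
        size_t payload = (m + align - 1) / align * align;
        size_t need = payload + sizeof(super_chunk);
        if (need <= arena_size) {
            sc = (super_chunk *)arena_ptr;
            sc->next = NULL;
            sc->size = payload;
            sc->owner_id = core_id;
            arena_ptr += need;   /* advance the start of un-allocated memory */
            arena_size -= need;  /* shrink the remaining arena */
        }
    }
    tas_unlock();
    return sc; /* NULL signals that the arena cannot satisfy the request */
}
```

Checking the size while holding the lock is a choice made for this sketch: it prevents two cores from both passing the check and then over-committing the arena.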

Algorithm 2 Algorithm scc_malloc to Allocate n Bytes from Free List of Chunks

1: mutex_lock()
2: memptr ← scc_malloc_local(n) ◃ K&R First-Fit malloc
3: mutex_unlock()
4: if memptr = NULL then
5:     scc_free_garbage() ◃ triggers garbage list clean-up
6:     mutex_lock()
7:     memptr ← scc_malloc_local(n)
8:     mutex_unlock()
9: if memptr = NULL then
10:     super-chunk ← scc_malloc_global(m)
11:     mutex_lock()
12:     scc_free_local(super-chunk) ◃ make it available to local allocator
13:     memptr ← scc_malloc_local(n)
14:     mutex_unlock()
15: if memptr = NULL then
16:     Not enough memory available, return Error
17: else
18:     Return memptr

Algorithm 2 allocates n bytes of memory from the free list. The local allocator scc_malloc_local implements a “first fit” allocation strategy. As interested readers can find a detailed description with sample code in [71], we refrain from listing any code here. The initial implementation of scc_malloc_local is not thread-safe; it is therefore protected by a lock (lines 1–3). Since scc_malloc_local is a core-local allocator, a mutex lock is sufficient. The following situations can occur when scc_malloc is called:

Lines 1–3 The allocation request is successful and the required memory is allocated by scc_malloc_local. No further action is required, and scc_malloc returns a pointer to the allocated memory.

Lines 4–8 The first call to scc_malloc_local returns NULL. There may, however, be chunks that are already marked for de-allocation waiting in the garbage list. In this case, we trigger garbage list clean-up by invoking scc_free_garbage. Once the clean-up is complete, we try again to allocate the required memory with scc_malloc_local. In case of success, a pointer to the allocated memory is returned.

Lines 9–14 The second call to scc_malloc_local also returns NULL. In this case we request a super-chunk from the global allocator scc_malloc_global. Before the super-chunk can be used, we have to make it available to scc_malloc_local by adding it to the free list, which is achieved by invoking scc_free_local. We then try one more time to allocate the required memory with scc_malloc_local. To ease contention on the global allocator, the core always requests m bytes of memory, where m > n.

Lines 15–18 Asking for a super-chunk is our last attempt to allocate memory. Failure in this case means there is not enough memory available, and the program exits with an out-of-memory error. In case of success, a pointer to the allocated memory is returned.

As can be seen from Algorithm 2, all calls to scc_malloc_local and scc_free_local are protected by a mutex from the Portable Operating System Interface [for Unix] (POSIX) thread library to make them thread-safe.
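The three-stage fallback of Algorithm 2 can be illustrated with the following self-contained C sketch. Everything apart from the control flow is a simplified stand-in and should be read as an assumption: the toy arena, the NULL-terminated (rather than circular) free list, the tail-splitting first-fit routine without coalescing, the unlocked global stub, and all names (scc_malloc_sketch, malloc_local, free_local, free_garbage, malloc_global, SUPER_CHUNK_BYTES) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct chunk {
    struct chunk *next; /* next chunk in the free or garbage list */
    size_t size;        /* usable bytes behind the header */
    int owner_id;       /* core-id of the owner */
} chunk;

#define UNIT sizeof(chunk)            /* all sizes are kept multiples of this */
#define ARENA_UNITS 1024
#define SUPER_CHUNK_BYTES (64 * UNIT) /* hypothetical surplus, so that m > n */

static chunk arena[ARENA_UNITS];      /* toy stand-in for the SHM-Arena */
static size_t arena_used = 0;         /* units already handed out */
static chunk *free_list = NULL;       /* NULL-terminated for brevity */
static chunk *garbage_list = NULL;    /* chunks awaiting clean-up */
static pthread_mutex_t local_lock = PTHREAD_MUTEX_INITIALIZER;

static size_t round_up(size_t n) { return (n + UNIT - 1) / UNIT * UNIT; }

/* Toy global allocator: bump allocation; the T&S lock is omitted here. */
static chunk *malloc_global(size_t m) {
    size_t units = 1 + round_up(m) / UNIT; /* header plus payload */
    if (units > ARENA_UNITS - arena_used) return NULL;
    chunk *sc = &arena[arena_used];
    arena_used += units;
    sc->next = NULL; sc->size = round_up(m); sc->owner_id = 0;
    return sc;
}

/* Put a chunk on the free list (no coalescing in this sketch). */
static void free_local(chunk *c) { c->next = free_list; free_list = c; }

/* First fit: take the first chunk that is large enough, splitting off its tail. */
static void *malloc_local(size_t n) {
    n = round_up(n);
    for (chunk **pp = &free_list; *pp != NULL; pp = &(*pp)->next) {
        chunk *c = *pp;
        if (c->size == n) {           /* exact fit: unlink the whole chunk */
            *pp = c->next;
            c->next = NULL;
            return c + 1;
        }
        if (c->size > n) {            /* sizes are UNIT multiples, so size >= n + UNIT */
            c->size -= n + UNIT;
            chunk *t = c + 1 + c->size / UNIT;
            t->next = NULL; t->size = n; t->owner_id = c->owner_id;
            return t + 1;
        }
    }
    return NULL;
}

/* Move every chunk from the garbage list back to the free list. */
static void free_garbage(void) {
    pthread_mutex_lock(&local_lock);
    while (garbage_list != NULL) {
        chunk *c = garbage_list;
        garbage_list = c->next;
        free_local(c);
    }
    pthread_mutex_unlock(&local_lock);
}

static void *locked_malloc_local(size_t n) {
    pthread_mutex_lock(&local_lock);
    void *p = malloc_local(n);
    pthread_mutex_unlock(&local_lock);
    return p;
}

/* Control flow of Algorithm 2; returns NULL instead of exiting on failure. */
static void *scc_malloc_sketch(size_t n) {
    if (n > sizeof arena) return NULL;          /* also avoids overflow in round_up */
    void *p = locked_malloc_local(n);           /* lines 1-3 */
    if (p == NULL) {                            /* lines 4-8 */
        free_garbage();
        p = locked_malloc_local(n);
    }
    if (p == NULL) {                            /* lines 9-14 */
        chunk *sc = malloc_global(n + SUPER_CHUNK_BYTES);
        if (sc != NULL) {
            pthread_mutex_lock(&local_lock);
            free_local(sc);
            p = malloc_local(n);
            pthread_mutex_unlock(&local_lock);
        }
    }
    return p;                                   /* lines 15-18 */
}
```

In this sketch the first request triggers a refill from the toy arena, while later requests of similar size are served from the remainder of that super-chunk, which is the effect that requesting m > n bytes is meant to achieve.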