Memory/Resource Management - Managing Persistent Data

Chapter 4 Programming Techniques and Caveats

4.5 Managing Persistent Data

4.5.2 Memory/Resource Management

As we saw in Chapter 3, APR pools provide a full and elegant solution to most resource management problems in Apache. Persistent resources are an exception, however, because they bring up a new problem: Are we leaking memory (or any other resource)? In the preceding code, if cache entries are ever deleted, the APR pool mechanism for managing resources fails us: the pool lives on, so whatever was allocated for a deleted entry is never returned. This becomes a bug, which a server administrator will have to work around by limiting MaxRequestsPerChild to prevent an indefinite leak.

Several approaches are available to deal with this problem.

Garbage Collection

Instead of terminating the entire child, it is more efficient overall just to terminate our own resource from time to time and reclaim any possibly leaked resources. We can do so by tearing down the pool we’ve been using and starting anew. We’ll need to make provision for this in our child_init function. In summary:

1. Add pchild to the my_svr_cfg struct.

2. Add a counter or a timeout to the my_svr_cfg struct.

3. Now we can clear garbage by winding up the module’s pool, creating a new pool from pchild, and starting again. This activity must, of course, take place in a critical section, which is why the mutex needs to outlive the pool.

Let’s take a look at a function to add garbage collection to our hash example. We call this function whenever an operation might leak, and we maintain a counter so that it does the real work only when it’s got a decent amount of real work to do. Of course, any operation that might leak will be happening under mutex anyway.

static apr_status_t do_garbage(my_server_cfg *svr) {
    /* Call this only while we hold the mutex within some hook */
    apr_hash_index_t *index;
    const void *key;
    apr_ssize_t klen;
    my_val_type *val;
    apr_pool_t *newpool;
    apr_hash_t *newcache;
    apr_status_t rv;

    if (svr->count++ < svr->max_count) {
        return APR_SUCCESS;
    }

    /* Creating the new pool is actually a very slow leak on pchild.
     * We can avoid this by creating and using a spare pool in place
     * of pchild (inefficient but doesn't leak) or, more simply, by
     * creating and destroying top-level pools.
     */
    rv = apr_pool_create(&newpool, svr->pchild);
    if (rv != APR_SUCCESS) {
        return rv;  /* We should also log an error message here */
    }

    /* Create the new cache in the new pool */
    newcache = apr_hash_make(newpool);

    /* Deep-copy current entries in our cache */
    for (index = apr_hash_first(NULL, svr->cache); index != NULL;
         index = apr_hash_next(index)) {
        apr_hash_this(index, &key, &klen, (void **)&val);

        /* Now we need to deep-copy key and val.
         * Of course, we also need an application-specific
         * deep_copy function. The keys are assumed to be
         * NUL-terminated strings.
         */
        apr_hash_set(newcache, apr_pstrdup(newpool, key), klen,
                     deep_copy(newpool, val));
    }

    /* Clean up the old pool. Delete the old hash together
     * with any hitherto-leaked stuff.
     */
    apr_pool_destroy(svr->pool);

    /* Reset our data fields */
    svr->pool = newpool;
    svr->cache = newcache;
    svr->count = 0;

    /* All done successfully */
    return APR_SUCCESS;
}

Sometimes we can get away with much less. For example, if we have a hash of objects that time out, and re-creating them is not too expensive, we could dispense with copying anything at all. Then the preceding code reduces to the much simpler function shown here:

/* As in the more complex case, this maybe-garbage-collect
 * must always happen under thread mutex
 */
static apr_status_t do_garbage(my_server_cfg *svr) {
    if (svr->count++ >= svr->max_count) {
        /* Just clean up everything, including the hash and its contents
         * along with whatever may have leaked.
         */
        apr_pool_clear(svr->pool);

        /* Re-initialize the cache and counter */
        svr->cache = apr_hash_make(svr->pool);
        svr->count = 0;
    }

    /* All done successfully */
    return APR_SUCCESS;
}

This is the “clean” alternative to leaking and using MaxRequestsPerChild as a workaround.
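To see the counter-gated clearing in isolation, here is a minimal sketch that does not use APR at all. A toy bump allocator stands in for the pool, and resetting it stands in for apr_pool_clear. The names toy_arena, arena_init, arena_alloc, and arena_maybe_clear, and the ARENA_SIZE capacity, are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 1024       /* hypothetical capacity in bytes */

/* Toy stand-in for a pool: memory is handed out but never freed
 * individually, so "deleted" objects leak until the arena is cleared. */
typedef struct {
    unsigned char buf[ARENA_SIZE];
    size_t used;              /* bytes handed out since the last clear */
    int count;                /* operations since the last clear */
    int max_count;            /* threshold that triggers a clear */
} toy_arena;

void arena_init(toy_arena *a, int max_count)
{
    assert(max_count >= 0);
    a->used = 0;
    a->count = 0;
    a->max_count = max_count;
}

/* Bump allocation of raw bytes (no alignment handling in this sketch).
 * Returns NULL when the arena is exhausted. */
void *arena_alloc(toy_arena *a, size_t n)
{
    void *p;
    if (n > ARENA_SIZE - a->used) {
        return NULL;
    }
    p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Counter-gated collection, mirroring the test in do_garbage:
 * do nothing until max_count calls have accumulated, then throw
 * everything away at once. Returns 1 if the arena was cleared. */
int arena_maybe_clear(toy_arena *a)
{
    if (a->count++ < a->max_count) {
        return 0;
    }
    a->used = 0;              /* reclaims live and leaked bytes alike */
    a->count = 0;
    return 1;
}
```

As in the real function, everything still in use is discarded along with the garbage, so this variant suits only data that is cheap to re-create, and the call must happen while we hold the mutex.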

Use of Subpools

A variant on the garbage collection scheme is to use a subpool for every hash entry. With this approach, we can delete the subpool and reclaim resources whenever the entry itself is deleted. Because the subpools themselves incur overhead, this strategy is most likely to be appropriate when the number of resources is modest, but their size and complexity is such that they dominate relative to the overhead associated with the pools themselves.

Given that the subpools are allocated from the main pool, they are themselves a resource that needs to be managed and a potential source of memory leaks. Subpools offer a partial solution to the problem, but should be used in conjunction with one of the other solutions: for example, clearing and reusing the subpools.

Reuse of Resources

When the objects we are managing are of fixed size, we can manage the memory ourselves within the module:

• We can allocate an array of objects, together with an indexing array of free/in-use flags.

• When we need an object, we can claim it from the array. When we’ve finished with it, we can mark it as “free.”

We can use this strategy with variable-sized objects by using subpools and managing the subpools themselves as the fixed-sized objects in the array. When we finish with an object, we run apr_pool_clear, but keep the pool itself for reuse.
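The fixed-size case can be sketched in plain C as follows. This is only an illustration under stated assumptions: the names obj_store, my_obj, store_init, store_claim, and store_release, the capacity SLOT_COUNT, and the 64-byte payload are all hypothetical, and in a real module the store would live in the server configuration and be accessed under the module's mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_COUNT 4          /* hypothetical capacity */

typedef struct {
    char data[64];            /* hypothetical fixed-size payload */
} my_obj;

typedef struct {
    my_obj objs[SLOT_COUNT];           /* the objects themselves */
    unsigned char in_use[SLOT_COUNT];  /* 0 = free, 1 = claimed */
} obj_store;

/* Mark every slot free and zero the objects. */
void store_init(obj_store *s)
{
    memset(s, 0, sizeof(*s));
}

/* Claim the first free slot, or return NULL when all are in use.
 * The claimed object is zeroed before it is handed out. */
my_obj *store_claim(obj_store *s)
{
    size_t i;
    for (i = 0; i < SLOT_COUNT; i++) {
        if (!s->in_use[i]) {
            s->in_use[i] = 1;
            memset(&s->objs[i], 0, sizeof(s->objs[i]));
            return &s->objs[i];
        }
    }
    return NULL;
}

/* Mark a previously claimed slot as free so that it can be reused. */
void store_release(obj_store *s, my_obj *obj)
{
    size_t i = (size_t)(obj - s->objs);
    assert(i < SLOT_COUNT && s->in_use[i]);
    s->in_use[i] = 0;
}
```

Because the array is allocated once and slots are only ever recycled, nothing is allocated from the pool after startup and nothing can leak. For the variable-sized variant, each slot would hold a subpool instead of a payload, and store_release would run apr_pool_clear on it.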

Use of a Reslist

The apr_reslist serves to manage a pool of resources for reuse, providing a fully managed solution for us. It is most appropriate where the resources themselves carry a high cost. mod_dbd (see Chapter 10) is a usage example. For a case like our cache example, we could either use a reslist of subpools or manage blocks of memory and thereby avoid any dynamic allocation.

Source: The Apache Modules Book: Application Development with Apache (pages 127-130)