Chapter 4 Programming Techniques and Caveats

4.7 Cross-MPM Programming Issues

As already hinted at, the MPM is really the platform for Apache. Because the APR deals with native platform issues such as the filesystem, the remaining MPM issues are the difficult ones. Principally, we have to deal with the consequences of running single or multiple processes, and implementing single or multiple threads within a process. This is not an “either/or” situation, however: Apache may also run with both multiple processes and multiple threads per process.

We’ve already discussed thread safety in Apache. The other major issue we need to deal with is coordinating between different processes. This coordination is generally expensive, and the types of interprocess interactions we can implement within the context of the standard Apache architecture are limited. Fortunately, such coordination is rarely necessary: While few modules need to concern themselves proactively with thread safety or resource management, fewer still need to concern themselves with interprocess issues.

There are two basic requirements you commonly have to consider:

• Global locks

• Shared memory

The APR provides Apache with support for both of these requirements.

4.7.1 Process and Global Locks

We’ve seen how using an APR thread mutex protects a critical section of code managing a server-based resource shared between threads. But APR provides two further mutexes: the process mutex apr_proc_mutex and the global mutex apr_global_mutex. When a module updates a globally shared resource (other than one with its own protection, such as an SQL database, or another server we are merely proxying), we need to use the latter mutex to protect critical sections of code. A case in which such a need often arises is when we are creating or updating files on the server.

The APR global mutex is more complex and more expensive than the thread mutex. The complexity lies in the initial setup of the mutex. First, it must be created in the parent process in the post_config phase. Second, each child has to attach to it in the child_init phase:

static int my_post_config(apr_pool_t *pool, apr_pool_t *plog,
                          apr_pool_t *ptemp, server_rec *s)
{
    /* Several types of locks are supported; see apr_global_mutex.h
     * APR_LOCK_DEFAULT selects a lock type considered appropriate
     * for the platform we are running on.
     */
    apr_status_t rc;
    my_svr_cfg *cfg = ap_get_module_config(s->module_config, &my_module);

    rc = apr_global_mutex_create(&cfg->mutex, cfg->mutex_name,
                                 APR_LOCK_DEFAULT, pool);
    if (rc != APR_SUCCESS) {
        ap_log_error(APLOG_MARK, APLOG_CRIT, rc, s,
                     "Parent could not create mutex %s", cfg->mutex_name);
        return rc;
    }
#ifdef AP_NEED_SET_MUTEX_PERMS
    rc = unixd_set_global_mutex_perms(cfg->mutex);
    if (rc != APR_SUCCESS) {
        /* ap_log_error takes the server_rec, not the module config */
        ap_log_error(APLOG_MARK, APLOG_CRIT, rc, s,
                     "Parent could not set permissions on global mutex:"
                     " check User and Group directives");
        return rc;
    }
#endif
    /* Destroy the mutex when the configuration pool is cleaned up */
    apr_pool_cleanup_register(pool, cfg->mutex,
                              (void*)apr_global_mutex_destroy,
                              apr_pool_cleanup_null);
    return OK;
}

static void my_child_init(apr_pool_t *pool, server_rec *s)
{
    my_svr_cfg *cfg = ap_get_module_config(s->module_config, &my_module);

    /* Each child process attaches to the mutex created by the parent */
    apr_global_mutex_child_init(&cfg->mutex, cfg->mutex_name, pool);
}

static void my_hooks(apr_pool_t *pool)
{
    ap_hook_child_init(my_child_init, NULL, NULL, APR_HOOK_MIDDLE);
    ap_hook_post_config(my_post_config, NULL, NULL, APR_HOOK_MIDDLE);
    ap_hook_handler(my_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

Now we’ve shown the two stages of global mutex creation and hooked an additional function: the content generator my_handler. A content generator is the most likely place in Apache to need a global mutex. Having set up our mutex in the server initialization, we can use it in the same manner as our thread mutex in any of our handlers:

static int my_handler(request_rec *r)
{
    /* Handler that edits some file on the server */
    apr_status_t rv;
    my_svr_cfg *cfg;

    cfg = ap_get_module_config(r->server->module_config, &my_module);

    /* Acquire the mutex */
    rv = apr_global_mutex_lock(cfg->mutex);
    if (rv != APR_SUCCESS) {
        ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                      "my_module: failed to acquire mutex!");
        return HTTP_INTERNAL_SERVER_ERROR;
    }

    /* Register a cleanup, so we don't risk holding the lock
     * forever if something bad happens to this request
     */
    apr_pool_cleanup_register(r->pool, cfg->mutex,
                              (void*)apr_global_mutex_unlock,
                              apr_pool_cleanup_null);

    /* Now perform our file ops while we have the global lock */

    /* If everything went OK, we can release the lock right now.
     * It may be worthwhile if there's much more processing yet to come
     * before this request is finished.
     */
    rv = apr_global_mutex_unlock(cfg->mutex);
    if (rv != APR_SUCCESS) {
        ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                      "my_module: failed to release mutex!");
    }

    /* The lock is released, so the cleanup is no longer needed.
     * The arguments must match those used when registering it.
     */
    apr_pool_cleanup_kill(r->pool, cfg->mutex,
                          (void*)apr_global_mutex_unlock);

    /* Further processing that doesn't require the mutex */

    return OK;
}
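To see what such a lock buys us when several processes edit the same file, it can help to step outside Apache for a moment. The following is a minimal sketch, not the APR implementation: it uses a POSIX fcntl() advisory lock directly (fcntl locking is one of the mechanisms listed in apr_global_mutex.h, as APR_LOCK_FCNTL), and it covers only the interprocess half of the problem, not threads. The names set_lock, locked_increment, and run_workers are invented for this illustration.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take (F_WRLCK) or release (F_UNLCK) an advisory lock on the whole file.
 * F_SETLKW blocks until the lock can be granted. */
static int set_lock(int fd, short type)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);   /* l_start = 0, l_len = 0: whole file */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    return fcntl(fd, F_SETLKW, &fl);
}

/* Critical section: read a decimal counter from the file, add one,
 * and write it back, all while holding the lock. */
static int locked_increment(const char *path)
{
    char buf[32];
    int len, rc = -1;
    ssize_t got;
    int fd = open(path, O_RDWR);
    if (fd < 0) return -1;
    if (set_lock(fd, F_WRLCK) == 0) {
        got = pread(fd, buf, sizeof buf - 1, 0);
        if (got >= 0) {
            buf[got] = '\0';
            len = snprintf(buf, sizeof buf, "%ld\n",
                           strtol(buf, NULL, 10) + 1);
            assert(len > 0 && (size_t)len < sizeof buf);
            if (pwrite(fd, buf, (size_t)len, 0) == len) rc = 0;
        }
        set_lock(fd, F_UNLCK);
    }
    close(fd);
    return rc;
}

/* Fork nprocs children that each increment the counter iters times.
 * Returns the final counter value, or -1 on any error. */
long run_workers(const char *path, int nprocs, int iters)
{
    char buf[32];
    ssize_t got;
    int i, j, fd, status, started = 0, failed = 0;

    fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0) return -1;
    if (write(fd, "0\n", 2) != 2) failed = 1;
    close(fd);
    if (failed) return -1;

    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0) { failed = 1; break; }
        if (pid == 0) {                      /* child process */
            for (j = 0; j < iters; j++)
                if (locked_increment(path) != 0) _exit(1);
            _exit(0);
        }
        started++;
    }
    for (i = 0; i < started; i++)
        if (wait(&status) < 0 || !WIFEXITED(status)
            || WEXITSTATUS(status) != 0)
            failed = 1;
    if (failed) return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    got = read(fd, buf, sizeof buf - 1);
    close(fd);
    unlink(path);                            /* tidy up the demo file */
    if (got < 0) return -1;
    buf[got] = '\0';
    return strtol(buf, NULL, 10);
}
```

Calling run_workers with four processes and 25 iterations each should leave the counter at 100. If the set_lock calls were removed, concurrent read-modify-write cycles could overwrite one another and lose updates, which is the kind of damage the global mutex in my_handler is there to prevent.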

4.7.2 Shared Memory

Many application designers identify a shared resource as a requirement. Sometimes, as in the example case of editing a file, the shared resource has an independent existence. In other cases, the resource is internal to the webserver, as in a situation involving shared memory.

Consider, for example, the cache we examined earlier in this chapter. If our data are worth caching, presumably it’s more expensive to compute them than to maintain a cache. So wouldn’t it be better to share the cache over all processes, rather than duplicate it for every process?

The answer to this question is commonly “no.” Shared memory is computationally expensive and too inflexible for the task of maintaining such a cache without incurring much more work. At the most fundamental level, a shared memory segment comes with no mechanism for memory allocation, and C pointers cannot meaningfully be shared. For all these reasons, you may want to avoid shared memory in your design.

Of course, sometimes you really do need shared memory. As usual, APR provides support for it.

Shared Memory: apr_shm

The APR shared memory module apr_shm serves well to share fixed-size data such as simple variables or structs comprising data members but no pointers.
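As a rough illustration of what "fixed-size data with no pointers" means in practice, here is a minimal sketch that shares one small struct between a parent and a child process. It deliberately does not use apr_shm: it assumes a POSIX system and uses an anonymous shared mmap() mapping instead, so that it stands alone. The struct shared_stats and the function share_with_child are hypothetical names. An ordinary static variable is updated alongside the shared struct to show the contrast.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A fixed-size record made of plain data members only: no pointers. */
typedef struct {
    unsigned long hits;
    int last_status;
    char label[16];
} shared_stats;

/* Ordinary process memory: after fork() each process has its own copy. */
static unsigned long private_hits = 0;

/* The child writes child_hits into both the shared record and the
 * private variable; the parent then reports what it can see of each.
 * Returns 0 on success, -1 on error. */
int share_with_child(unsigned long child_hits,
                     unsigned long *shared_out, unsigned long *private_out)
{
    int status;
    pid_t pid;
    shared_stats *st;

    assert(shared_out != NULL && private_out != NULL);

    /* One shared_stats worth of memory, visible to parent and child */
    st = mmap(NULL, sizeof *st, PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED) return -1;
    memset(st, 0, sizeof *st);
    private_hits = 0;

    pid = fork();
    if (pid < 0) {
        munmap(st, sizeof *st);
        return -1;
    }
    if (pid == 0) {                 /* child process */
        st->hits = child_hits;
        st->last_status = 200;
        strcpy(st->label, "child");
        private_hits = child_hits;  /* changes the child's copy only */
        _exit(0);
    }

    /* Waiting for the child orders its writes before our reads;
     * truly concurrent access would also need a lock. */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)
        || WEXITSTATUS(status) != 0) {
        munmap(st, sizeof *st);
        return -1;
    }

    *shared_out = st->hits;         /* the child's update is visible */
    *private_out = private_hits;    /* still 0 in the parent */
    munmap(st, sizeof *st);
    return 0;
}
```

Because every member of shared_stats is stored inline, the bytes mean the same thing in both processes. A char * member in place of label would not: the child would store an address that is meaningful only in its own address space.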

Pointers in Shared Memory: apr_rmm

As mentioned earlier, pointers in apr_shm shared memory are meaningless, because the address space they point to is not shared. It is possible to implement pointers in shared memory by using another APR module, apr_rmm, to manage a block of memory allocated by apr_shm. As an example, mod_ldap uses this combination to manage a shared cache with dynamic allocation:

apr_status_t util_ldap_cache_init(apr_pool_t *pool, util_ldap_state_t *st)
{
#if APR_HAS_SHARED_MEMORY
    apr_status_t result;
    apr_size_t size;

    if (st->cache_file) {
        /* Remove any existing shm segment with this name. */
        apr_shm_remove(st->cache_file, st->pool);
    }

    size = APR_ALIGN_DEFAULT(st->cache_bytes);

    result = apr_shm_create(&st->cache_shm, size, st->cache_file, st->pool);
    if (result != APR_SUCCESS) {
        return result;
    }

    /* Determine the usable size of the shm segment */
    size = apr_shm_size_get(st->cache_shm);

    /* This will create an rmm "handler" to get into the shared memory area */
    result = apr_rmm_init(&st->cache_rmm, NULL,
                          apr_shm_baseaddr_get(st->cache_shm), size,
                          st->pool);
    if (result != APR_SUCCESS) {
        return result;
    }
#endif

    /* OMITTED FOR BREVITY */

    /* Register a cleanup on the pool to run apr_rmm_destroy
     * and apr_shm_destroy when apache exits.
     */

    /* More initialization for ldap */

    return APR_SUCCESS;
}
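The idea that makes this workable is that allocations are identified by their offset from the start of the block rather than by an absolute address: apr_rmm_malloc returns an apr_rmm_off_t, and each process converts it to a local address with apr_rmm_addr_get. The following sketch illustrates that idea in plain C, without APR and under simplifying assumptions: a trivial bump allocator that never frees, and a second buffer standing in for another process's mapping of the same segment at a different address. The names blk_init, blk_alloc, blk_addr, list_push, and list_sum are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 1024
#define NIL ((size_t)0)        /* offset 0 is reserved to mean "none" */
#define ALIGN_UP(n) (((n) + 15u) & ~(size_t)15u)

/* Bookkeeping kept at the start of the block itself. */
typedef struct {
    size_t next_free;          /* offset of the first unused byte */
} block_header;

/* A list node that links to its successor by offset, not by pointer. */
typedef struct {
    size_t next;               /* offset of the next node, or NIL */
    int value;
} node;

/* Prepare an empty block; the header occupies the first bytes. */
void blk_init(void *base)
{
    block_header *h = base;
    h->next_free = ALIGN_UP(sizeof(block_header));
}

/* Bump-allocate size bytes; returns an offset, or NIL if full. */
size_t blk_alloc(void *base, size_t size)
{
    block_header *h = base;
    size_t off = h->next_free;
    size = ALIGN_UP(size);
    if (size > BLOCK_SIZE - off) return NIL;
    h->next_free = off + size;
    return off;
}

/* Convert an offset to an address valid for this mapping only. */
void *blk_addr(void *base, size_t off)
{
    assert(off != NIL && off < BLOCK_SIZE);
    return (unsigned char *)base + off;
}

/* Push a value on the front of the list whose head offset is *head. */
int list_push(void *base, size_t *head, int value)
{
    node *n;
    size_t off = blk_alloc(base, sizeof(node));
    if (off == NIL) return -1;
    n = blk_addr(base, off);
    n->value = value;
    n->next = *head;
    *head = off;
    return 0;
}

/* Sum the list as seen through the mapping that starts at base. */
int list_sum(void *base, size_t head)
{
    int sum = 0;
    while (head != NIL) {
        node *n = blk_addr(base, head);
        sum += n->value;
        head = n->next;
    }
    return sum;
}

/* Build a list in one buffer, copy the raw bytes to a second buffer
 * at a different address, wipe the first, and sum through the copy. */
int demo_sum_via_second_mapping(void)
{
    void *a = calloc(1, BLOCK_SIZE);
    void *b = calloc(1, BLOCK_SIZE);
    size_t head = NIL;
    int sum = -1;

    if (a != NULL && b != NULL) {
        blk_init(a);
        if (list_push(a, &head, 1) == 0 && list_push(a, &head, 2) == 0
            && list_push(a, &head, 3) == 0) {
            memcpy(b, a, BLOCK_SIZE);
            memset(a, 0, BLOCK_SIZE);  /* nothing may depend on a now */
            sum = list_sum(b, head);
        }
    }
    free(a);
    free(b);
    return sum;
}
```

Had the nodes stored node * pointers, the copied list would still point into the first buffer and the traversal would read wiped memory. With offsets, the same bytes can be interpreted correctly from any base address, at the cost of a conversion on every access.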

Now mod_ldap can use the apr_rmm functions (including versions of malloc, calloc, realloc, and free) and obtain pointers in shared memory. However, we are still working with a fixed-size block, and our apr_rmm operations will be substantially slower than normal apr_pool allocation.

Fully Generic Shared Memory

If we wish to implement other APR and Apache data types in shared memory, we might want to implement an APR pool based on our apr_rmm functions. This is not possible in the APR as it stands, but such a strategy could, in principle, be made to work with modest modifications based on an alternative apr_allocator that uses the apr_rmm memory block and functions. Unfortunately, handling errors and managing pool lifetime are unlikely to be straightforward operations with this approach.

Persistent/Unlimited Shared Resources: apr_dbm and apr_memcache

DBM files are keyed lookup databases, typically based on hashing and fast lookup. They are (usually) held on the filesystem, so they can be used to share arbitrary data between processes. These databases, which represent an alternative to apr_shm/apr_rmm, are better suited to management of larger shared resources or resources whose sizes cannot be set in the Apache configuration. They are also persistent, meaning that they will survive a restart of Apache.

The apr_memcache module is functionally similar (though by no means identical) to apr_dbm, but uses a (possibly remote) memcached server instead of the local filesystem.

In document The Apache Modules Book Application Development with Apache pdf (Page 132-137)