
Chapter 5 Writing a Content Generator

5.3 The Default Handler

So far, we’ve presented simple variants on a simple handler, and highlighted the tools required to develop a content handler equivalent to a normal CGI or PHP script. To conclude this chapter, we’ll present Apache’s default handler. Although it serves a file from the server’s filesystem, this handler differs from our earlier functions in that it does quite a lot more housekeeping, illustrating more of the core API. Apache’s default handler is more advanced than the handlers shown in the previous examples, and you may prefer to skip it on a first reading.

static int default_handler(request_rec *r)
{
    conn_rec *c = r->connection;
    apr_bucket_brigade *bb;
    apr_bucket *e;
    core_dir_config *d;
    int errstatus;
    apr_file_t *fd = NULL;
    apr_status_t status;
    int bld_content_md5;

ap_get_module_config retrieves the module’s configuration (Chapter 9):

    d = (core_dir_config *)ap_get_module_config(r->per_dir_config,
                                                &core_module);

We can compute an MD5 hash if our system is configured to do so, but only if there isn’t a filter that will transform the contents and invalidate our hash.

    bld_content_md5 = (d->content_md5 & 1)
        && r->output_filters->frec->ftype != AP_FTYPE_RESOURCE;

Because this is the handler of last resort, we can’t just return DECLINED if we don’t want the request. Instead, we record the methods we do accept:

    ap_allow_standard_methods(r, MERGE_ALLOW, M_GET, M_OPTIONS, M_POST, -1);
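To make the purpose of the allowed-methods list more concrete, here is a minimal, self-contained sketch in plain C of how a set of accepted methods can be kept as a bitmask and turned into the value of an Allow header. It is only a model: the X_* constants and the function build_allow_value are hypothetical names invented for this illustration and are not part of the Apache API, which keeps its own method numbers and bookkeeping.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical method numbers for this sketch (Apache has its own M_* values). */
enum { X_GET = 0, X_POST = 1, X_OPTIONS = 2, X_METHODS = 3 };

static const char *const x_method_names[X_METHODS] = { "GET", "POST", "OPTIONS" };

/*
 * Write a comma-separated list of the methods whose bits are set in `mask`
 * into `buf` (capacity `cap`, including the terminating NUL).
 * Returns 0 on success, -1 if the buffer is too small.
 */
int build_allow_value(unsigned mask, char *buf, size_t cap)
{
    size_t used = 0;
    int i;

    if (cap == 0) {
        return -1;
    }
    buf[0] = '\0';

    for (i = 0; i < X_METHODS; i++) {
        size_t len;

        if (!(mask & (1u << i))) {
            continue;               /* method not allowed: skip it */
        }
        len = strlen(x_method_names[i]);
        /* Room needed: optional comma + name + terminating NUL. */
        if (used + (used ? 1 : 0) + len + 1 > cap) {
            return -1;
        }
        if (used) {
            buf[used++] = ',';
        }
        memcpy(buf + used, x_method_names[i], len);
        used += len;
        buf[used] = '\0';
    }
    return 0;
}
```

With the GET and OPTIONS bits set, the sketch produces the string GET,OPTIONS, which is the kind of value a client sees in the Allow header of a 405 (Method Not Allowed) response.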

The next check is housekeeping: it discards any request body that we are not going to read. It’s not strictly necessary, because Apache will perform this task for us if unused input remains when it destroys the request.

    /* If filters intend to consume the request body, they must
     * register an InputFilter to slurp the contents of the POST
     * data from the POST input stream. It no longer exists when
     * the output filters are invoked by the default handler.
     */
    if ((errstatus = ap_discard_request_body(r)) != OK) {
        return errstatus;
    }

    if (r->method_number == M_GET || r->method_number == M_POST) {
        if (r->finfo.filetype == 0) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                          "File does not exist: %s", r->filename);
            return HTTP_NOT_FOUND;
        }

This handler serves only normal files; Apache handles directories differently. If a request for a directory reaches this handler, it’s a configuration error.

        /* Don't try to serve a directory. Some OSs do weird things
         * with raw I/O on a directory.
         */
        if (r->finfo.filetype == APR_DIR) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                          "Attempt to serve directory: %s", r->filename);
            return HTTP_NOT_FOUND;
        }

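Next, we deal with any extra path information left over at the end of the request URI. Unless the request has been marked as accepting it, we reject the request.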

        if ((r->used_path_info != AP_REQ_ACCEPT_PATH_INFO)
            && r->path_info && *r->path_info)
        {
            /* default to reject */
            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                          "File does not exist: %s",
                          apr_pstrcat(r->pool, r->filename, r->path_info, NULL));
            return HTTP_NOT_FOUND;
        }

        /* We understood the (non-GET) method, but it might not be legal
         * for this particular resource. Check whether the 'deliver_script'
         * flag is set. If so, then go ahead and deliver the file because
         * it isn't really content (only GET normally returns content).
         *
         * Note: The only possible non-GET method at this point is POST.
         * In the future, we should enable script delivery for all methods.
         */
        if (r->method_number != M_GET) {
            core_request_config *req_cfg;

            req_cfg = ap_get_module_config(r->request_config, &core_module);
            if (!req_cfg->deliver_script) {
                /* The flag hasn't been set for this request. Punt. */
                ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                              "This resource does not accept the %s method.",
                              r->method);
                return HTTP_METHOD_NOT_ALLOWED;
            }
        }

        if ((status = apr_file_open(&fd, r->filename, APR_READ | APR_BINARY
#if APR_HAS_SENDFILE
                            | ((d->enable_sendfile == ENABLE_SENDFILE_OFF)
                                                ? 0 : APR_SENDFILE_ENABLED)
#endif
                                    , 0, r->pool)) != APR_SUCCESS) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR, status, r,
                          "file permissions deny server access: %s", r->filename);
            return HTTP_FORBIDDEN;
        }

Now we set a few more standard headers:

        ap_update_mtime(r, r->finfo.mtime);
        ap_set_last_modified(r);
        ap_set_etag(r);
        apr_table_setn(r->headers_out, "Accept-Ranges", "bytes");
        ap_set_content_length(r, r->finfo.size);

        bb = apr_brigade_create(r->pool, c->bucket_alloc);

ap_meets_conditions carries out some useful checks, cross-referencing the file information to the request headers to determine whether we really need to send the file or just to confirm the validity of a client’s cached copy. In exceptional circumstances, it may determine that our file is useless to the client and should be discarded.
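As a rough illustration of the kind of decision involved, the following self-contained sketch models just one such check, the comparison of a file’s modification time against an If-Modified-Since date, in plain C. It is not Apache’s implementation: the names cond_result and check_if_modified_since are hypothetical, the treatment of a date in the future is an assumption of this sketch, and the real function also considers other conditional headers such as If-Match and If-None-Match.

```c
#include <time.h>

/* Hypothetical result codes for this sketch (not Apache's constants). */
enum cond_result {
    COND_SEND_FILE = 0,       /* the client needs the full response */
    COND_NOT_MODIFIED = 304   /* the client's cached copy is still valid */
};

/*
 * Simplified model of an If-Modified-Since check.
 *   mtime              last modification time of the file
 *   if_modified_since  date from the request header, or (time_t)-1 if absent
 *   now                current server time
 */
enum cond_result check_if_modified_since(time_t mtime,
                                         time_t if_modified_since,
                                         time_t now)
{
    if (if_modified_since == (time_t)-1) {
        return COND_SEND_FILE;      /* unconditional request */
    }
    if (if_modified_since > now) {
        return COND_SEND_FILE;      /* assumption: ignore dates in the future */
    }
    if (mtime <= if_modified_since) {
        return COND_NOT_MODIFIED;   /* file unchanged since the client's copy */
    }
    return COND_SEND_FILE;          /* file changed: send it again */
}
```

The handler applies the real check as follows, closing the file and recording the status when the full response is not needed.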

        if ((errstatus = ap_meets_conditions(r)) != OK) {
            apr_file_close(fd);
            r->status = errstatus;
        }
        else {
            if (bld_content_md5) {
                apr_table_setn(r->headers_out, "Content-MD5",
                               ap_md5digest(r->pool, fd));
            }

            /* For platforms where the size of the file may be larger
             * than can be stored in a single bucket (where the
             * length field is an apr_size_t), split it into several
             * buckets
             */
            if (sizeof(apr_off_t) > sizeof(apr_size_t)
                && r->finfo.size > AP_MAX_SENDFILE) {
                apr_off_t fsize = r->finfo.size;

                e = apr_bucket_file_create(fd, 0, AP_MAX_SENDFILE, r->pool,
                                           c->bucket_alloc);
                while (fsize > AP_MAX_SENDFILE) {
                    apr_bucket *ce;
                    apr_bucket_copy(e, &ce);
                    APR_BRIGADE_INSERT_TAIL(bb, ce);
                    e->start += AP_MAX_SENDFILE;
                    fsize -= AP_MAX_SENDFILE;
                }
                e->length = (apr_size_t)fsize; /* Resize just the last bucket */
            }
            else {
                e = apr_bucket_file_create(fd, 0, (apr_size_t)r->finfo.size,
                                           r->pool, c->bucket_alloc);
            }

#if APR_HAS_MMAP
            if (d->enable_mmap == ENABLE_MMAP_OFF) {
                (void)apr_bucket_file_enable_mmap(e, 0);
            }
#endif
            APR_BRIGADE_INSERT_TAIL(bb, e);
        }

        e = apr_bucket_eos_create(c->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bb, e);

        status = ap_pass_brigade(r->output_filters, bb);
        if (status == APR_SUCCESS
            || r->status != HTTP_OK
            || c->aborted) {
            return OK;
        }
        else {
            /* No way to know what type of error occurred */
            ap_log_rerror(APLOG_MARK, APLOG_DEBUG, status, r,
                          "default_handler: ap_pass_brigade returned %i",
                          status);
            return HTTP_INTERNAL_SERVER_ERROR;
        }
    }

    else { /* unusual method (not GET or POST) */
        if (r->method_number == M_INVALID) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                          "Invalid method in request %s", r->the_request);
            return HTTP_NOT_IMPLEMENTED;
        }

Another API call supports the OPTIONS method:

        if (r->method_number == M_OPTIONS) {
            return ap_send_http_options(r);
        }
        return HTTP_METHOD_NOT_ALLOWED;
    }
}
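The arithmetic behind the handler’s loop that splits a large file into several buckets can be seen in isolation in the following self-contained sketch. It is a plain C model only: struct chunk and split_file are hypothetical stand-ins for the file buckets and the loop in the handler, and no APR types are used. As in the handler, full-sized pieces are emitted while more than the maximum remains, and the final piece is resized to hold the remainder.

```c
#include <stddef.h>
#include <stdint.h>

/* One piece of a file: a hypothetical stand-in for a file bucket. */
struct chunk {
    int64_t start;   /* offset of the piece within the file */
    size_t  length;  /* number of bytes in the piece */
};

/*
 * Split a file of `size` bytes into pieces of at most `max_chunk` bytes.
 * Up to `cap` pieces are written to `out`; the return value is the number
 * of pieces required, which may exceed `cap`.
 */
size_t split_file(int64_t size, size_t max_chunk, struct chunk *out, size_t cap)
{
    int64_t start = 0;
    int64_t remaining = size;
    size_t n = 0;

    /* Emit full-sized pieces while more than max_chunk bytes remain. */
    while (remaining > (int64_t)max_chunk) {
        if (n < cap) {
            out[n].start = start;
            out[n].length = max_chunk;
        }
        n++;
        start += (int64_t)max_chunk;
        remaining -= (int64_t)max_chunk;
    }

    /* The last piece holds whatever is left over. */
    if (n < cap) {
        out[n].start = start;
        out[n].length = (size_t)remaining;
    }
    n++;
    return n;
}
```

For example, a 25-byte file with a 10-byte maximum yields three pieces, starting at offsets 0, 10, and 20, with lengths 10, 10, and 5. Note that a file whose size is an exact multiple of the maximum ends with a full-sized piece rather than an empty one, because the loop stops as soon as the remainder fits in a single piece.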

5.4 Summary

This chapter dealt with content generators and related topics:

• It introduced the Apache module structure.

• It showed how a module can register a handler function with the core.

• It described the basic handler API.

• It described the role of content generator modules and developed a simple module.

• It showed how a content generator works with the request_rec object to obtain information such as headers and environment variables, to perform I/O, and to access form data.

• It demonstrated basic error handling.

• It described basic housekeeping commonly encountered in modules.

• It introduced Apache’s default handler, demonstrating slightly more advanced techniques to serve static files efficiently and with proper attention to the HTTP protocol.

At this point, you should be able to write an application as a module or rewrite a CGI script as a module. While we have introduced the overall structural skeleton of a module, our coverage has left several blanks. The remaining parts of the module structure are concerned with configuration; they will be discussed in Chapter 9. The meaning of hooks and their registration are covered in Chapter 10. Next, Chapters 6, 7, and 8 complete our discussion of request handling fundamentals by introducing the request processing cycle, access and authentication, and the filter chain.

Chapter 6 Request Processing Cycle

Before returning contents to the client, Apache needs to examine the HTTP request with reference to the server configuration. Much of the Apache standard code is concerned with this task, and sometimes we may need to write a new module to support different behavior. Such modules work by hooking into the early parts of request processing, before any content generator is invoked, and sometimes by diverting or aborting the entire request.

In this chapter, we will first review the metadata sent to the server in an HTTP request. We will then see how the standard modules in Apache deal with this in handling a request. Finally, we will develop a new module.

Note that there is no universally agreed-upon nomenclature here. Modules directly relevant to this chapter are classified into various categories in the Apache distribution:

• Mappers (modules that map from a request URL to the internal structure of the server and/or the filesystem)

• Metadata (modules that explicitly manipulate HTTP headers and/or Apache’s internal state)

• AAA (access, authentication, and authorization modules—the most popular class of metadata modules; discussed in detail in Chapter 7)

This chapter deals with general issues concerning the request processing cycle and metadata handling. Of course, many modules with a different primary purpose (e.g., handlers) may include metadata hooks alongside other functions.

A great deal of folklore has arisen concerning certain uses of metadata and request handling—for example, methods for presenting different types of content to different visitors. At worst, adhering to these myths can lead to broken reimplementations of standard features (reinventing the wheel, but the new one isn’t round)! Professional developers as well as hobbyists may be guilty of this. This chapter warns you about some of the more common misconceptions.

6.1 HTTP

To discuss HTTP request processing, we first need to understand some basics about the Hypertext Transfer Protocol (HTTP).

6.1.1 The HTTP Protocol

HTTP is one member of a broad family of networking protocols for passing messages, whose roots go back to the early days of the Internet. The oldest of these protocols still in general use today is Internet e-mail, whose message format was standardized as RFC822 in 1982 (alongside SMTP, RFC821). The protocol of the Web is HTTP, which is specified in RFC1945 (HTTP/1.0) and RFC2616 (HTTP/1.1, the current protocol version—see Appendix C). These protocols share some common overall characteristics, designed for the exchange of messages.
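One characteristic shared by e-mail messages and HTTP messages is a block of textual header lines, each of the form Name: value. As a minimal sketch of that shared format, the following self-contained C function splits one such line into its name and value. The function parse_header_line is a hypothetical helper written for this illustration; it is not part of Apache or APR, and it ignores details such as continuation lines.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a header line of the form "Name: value" into its two parts.
 * Leading whitespace before the value and trailing whitespace (including
 * a CRLF line ending) are removed.
 * Returns 0 on success, or -1 if the line has no colon, the name is empty,
 * or either output buffer is too small.
 */
int parse_header_line(const char *line,
                      char *name, size_t name_cap,
                      char *value, size_t value_cap)
{
    const char *colon = strchr(line, ':');
    const char *v;
    size_t name_len;
    size_t value_len;

    if (colon == NULL || colon == line) {
        return -1;
    }
    name_len = (size_t)(colon - line);
    if (name_len >= name_cap) {
        return -1;
    }

    /* Skip optional whitespace between the colon and the value. */
    v = colon + 1;
    while (*v == ' ' || *v == '\t') {
        v++;
    }

    /* Trim trailing whitespace, including any CR/LF. */
    value_len = strlen(v);
    while (value_len > 0 && isspace((unsigned char)v[value_len - 1])) {
        value_len--;
    }
    if (value_len >= value_cap) {
        return -1;
    }

    memcpy(name, line, name_len);
    name[name_len] = '\0';
    memcpy(value, v, value_len);
    value[value_len] = '\0';
    return 0;
}
```

Given the line Host: www.example.com followed by CRLF, the sketch yields the name Host and the value www.example.com.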