Chapter 5 Writing a Content Generator
5.2 The Request, the Response, and the Environment
5.2.1 Module I/O
Our HelloWorld module generates output using a stdio-like family of functions: ap_rputc, ap_rputs, ap_rwrite, ap_rvputs, ap_vrprintf, ap_rprintf, and ap_rflush. We have also seen the “send file” call ap_send_file. This simple, high-level API was inherited originally from earlier Apache versions, and it remains suitable for many content generators. It is defined in http_protocol.h.
Since the introduction of the filter chain, the underlying mechanism for generating output has been based on buckets and brigades, as discussed in Chapters 3 and 8. Filter modules employ different mechanisms for generating output, and these are also available to—and sometimes appropriate for—a content handler.
There are two fundamentally different ways to process or generate output in a filter:
• Direct manipulation of buckets and brigades
• Use of another stdio-like API (which is a better option than the ap_r* API, as backward compatibility isn’t an issue)
We will describe these mechanisms in detail in Chapter 8. For now, we will look at the basic mechanics of using the filter-oriented I/O in a content generator.
There are three steps to using filter I/O for output:
1. Create a bucket brigade.
2. Populate the brigade with the data we are writing.
3. Pass the brigade to the first output filter on the stack (r->output_filters).
These steps can be repeated as many times as needed, either by creating a new brigade or by reusing a single brigade. If a response is large and/or slow to generate, we may want to pass it down the filter chain in smaller chunks. The response can then be passed through the filters and to the client in chunks, giving us an efficient pipeline and avoiding the overhead of buffering the entire response. Working properly with the pipeline whenever possible is an extremely useful goal for filter modules.
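The repeat-and-pass pattern described above can be modeled without any Apache headers. The following is a minimal sketch in plain C, not the APR brigade API: pass_fn, send_in_chunks, sink_stats, and counting_sink are hypothetical names introduced only for illustration. A small fixed buffer plays the role of a reused brigade, and a callback plays the role of the next filter in the chain.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_MAX 8   /* deliberately tiny so the chunking is visible */

/* Hypothetical stand-in for "the next filter": returns 0 on success. */
typedef int (*pass_fn)(void *ctx, const char *data, size_t len);

/* Send len bytes downstream in pieces of at most CHUNK_MAX bytes,
 * stopping at the first error reported by the downstream callback. */
int send_in_chunks(const char *data, size_t len, pass_fn pass, void *ctx)
{
    char chunk[CHUNK_MAX];   /* reused for every pass, like a reused brigade */
    size_t off = 0;

    while (off < len) {
        size_t n = (len - off < CHUNK_MAX) ? len - off : CHUNK_MAX;
        int rv;

        memcpy(chunk, data + off, n);   /* populate the container */
        rv = pass(ctx, chunk, n);       /* pass it down the chain */
        if (rv != 0) {
            return rv;                  /* propagate the downstream error */
        }
        off += n;
    }
    return 0;
}

/* Demonstration sink: counts calls and bytes, and can be told to fail. */
struct sink_stats {
    size_t calls;
    size_t bytes;
    int fail_on_call;   /* 0 means never fail */
};

int counting_sink(void *ctx, const char *data, size_t len)
{
    struct sink_stats *s = ctx;

    (void)data;
    s->calls++;
    if (s->fail_on_call != 0 && s->calls == (size_t)s->fail_on_call) {
        return -1;
    }
    s->bytes += len;
    return 0;
}
```

Because each chunk is handed on as soon as it is filled, the sender never holds more than CHUNK_MAX bytes at once, which is the point of working with the pipeline rather than buffering the whole response.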
For our HelloWorld module, all we need to do is to create the brigade and then replace the ap_r* family calls with the alternative stdio-like API defined in util_filter.h: ap_fflush, ap_fwrite, ap_fputs, ap_fputc, ap_fputstrs, and ap_fprintf. These calls have a slightly different prototype: Instead of passing request_rec as a file descriptor, we have to pass both the destination filter we are writing to and the bucket brigade. We’ll see examples of this scheme in Chapter 8.
5.2.1.1 Output
Here is our first trivial HelloWorld handler using filter-oriented output. This lower-level API is a little more complex than the simple stdio-like buffered I/O, but it may sometimes enable optimizations of the module (though in this instance, any difference will be negligible). We can also take advantage of slightly finer control by explicitly processing output errors.
    static int helloworld_handler(request_rec *r)
    {
        static const char *const helloworld =
            "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\">\n"
            "<html><head><title>Apache HelloWorld Module</title></head>"
            "<body><h1>Hello World!</h1>"
            "<p>This is the Apache HelloWorld module!</p>"
            "</body></html>";
        apr_status_t rv;
        apr_bucket_brigade *bb;
        apr_bucket *b;

        if (!r->handler || strcmp(r->handler, "helloworld")) {
            return DECLINED;
        }
        if (r->method_number != M_GET) {
            return HTTP_METHOD_NOT_ALLOWED;
        }
        bb = apr_brigade_create(r->pool, r->connection->bucket_alloc);
        ap_set_content_type(r, "text/html;charset=ascii");

        /* We could instead use the stdio-like filter API calls like
         *     ap_fputs(r->output_filters, bb, helloworld);
         * which is basically the same as using ap_rputs and family.
         *
         * Alternatively, we can wrap our output in a bucket, append an
         * EOS, and pass it down the filter chain.
         */
        b = apr_bucket_immortal_create(helloworld, strlen(helloworld),
                                       bb->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bb, b);
        APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_eos_create(bb->bucket_alloc));

        /* The head of the output filter chain is r->output_filters */
        rv = ap_pass_brigade(r->output_filters, bb);
        if (rv != APR_SUCCESS) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r, "Output Error");
            return HTTP_INTERNAL_SERVER_ERROR;
        }
        return OK;
    }
5.2.1.2 Input
Module input is slightly different. Once again, we have at our disposal a legacy method inherited from Apache 1.x, but it is now treated as deprecated by most developers (although the method is still supported). In most cases, we would prefer to use the input filter chain directly in new code:
1. Create a bucket brigade.
2. Pull data into the brigade from the first input filter (r->input_filters).
3. Read the data in our buckets, and use it.
Both input methods are commonly found in existing modules, including modules for Apache 2.x. Let’s introduce each in turn into our HelloWorld module. We’ll update the module to support POSTs and count the number of bytes POSTed (note that this count is usually, but not always, available from the Content-Length request header). We won’t decode or display the actual data; although we could do so, this task is usually best handled by an input filter (or by a library such as libapreq). The functions we use here are documented in http_protocol.h:
    #define BUFLEN 8192
    static int check_postdata_old_method(request_rec *r)
    {
        char buf[BUFLEN];
        /* ap_get_client_block returns a long, with -1 signaling an error,
         * so the byte count must be held in a signed type.
         */
        long bytes, count = 0;

        /* Decide how to treat input */
        if (ap_setup_client_block(r, REQUEST_CHUNKED_DECHUNK) != OK) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, "Bad request body!");
            ap_rputs("<p>Bad request body.</p>\n", r);
            return HTTP_BAD_REQUEST;
        }
        if (ap_should_client_block(r)) {
            for (bytes = ap_get_client_block(r, buf, BUFLEN);
                 bytes > 0;
                 bytes = ap_get_client_block(r, buf, BUFLEN)) {
                count += bytes;
            }
            ap_rprintf(r, "<p>Got %ld bytes of request body data.</p>\n", count);
        }
        else {
            ap_rputs("<p>No request body.</p>\n", r);
        }
        return OK;
    }
    static int helloworld_handler(request_rec *r)
    {
        if (!r->handler || strcmp(r->handler, "helloworld")) {
            return DECLINED;
        }
        /* We could be just slightly sloppy and drop this altogether,
         * but it's good practice to reject anything that's not explicitly
         * allowed. It cuts off *potential* exploits for someone trying
         * to compromise the server.
         */
        if ((r->method_number != M_GET) && (r->method_number != M_POST)) {
            return HTTP_METHOD_NOT_ALLOWED;
        }
        ap_set_content_type(r, "text/html;charset=ascii");
        ap_rputs("<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\">\n"
                 "<html><head><title>Apache HelloWorld Module</title></head>"
                 "<body><h1>Hello World!</h1>"
                 "<p>This is the Apache HelloWorld module!</p>", r);

        /* Print the tables */
        printtable(r, r->headers_in, "Request Headers", "Header", "Value");
        printtable(r, r->headers_out, "Response Headers", "Header", "Value");
        printtable(r, r->subprocess_env, "Environment", "Variable", "Value");

        /* Ignore the return value: it's too late to bail out now
         * even if there's an error
         */
        check_postdata_old_method(r);

        ap_rputs("</body></html>", r);
        return OK;
    }
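The read loop in check_postdata_old_method depends on the return convention of ap_get_client_block, which returns a signed long: a positive byte count, 0 at the end of the body, and -1 on error. If that result is stored in an unsigned type, -1 becomes a very large positive number and the error is silently counted as data. The following is a minimal sketch of the loop shape in plain C, with no Apache headers; read_block_fn, count_body_bytes, and scripted_reader are hypothetical stand-ins that only imitate the return convention.

```c
#include <stddef.h>

/* Hypothetical stand-in for a blocking body reader. It follows the same
 * return convention as ap_get_client_block: >0 bytes read, 0 at the end
 * of the body, -1 on error. */
typedef long (*read_block_fn)(void *ctx, char *buf, size_t bufsiz);

/* Count body bytes; returns the total, or -1 if the reader failed. */
long count_body_bytes(read_block_fn read_block, void *ctx)
{
    char buf[64];
    long n;           /* signed on purpose: it must be able to hold -1 */
    long total = 0;

    while ((n = read_block(ctx, buf, sizeof buf)) > 0) {
        total += n;
    }
    return (n < 0) ? -1 : total;
}

/* Demonstration reader that replays a scripted list of return values. */
struct script {
    const long *results;
    size_t next;
};

long scripted_reader(void *ctx, char *buf, size_t bufsiz)
{
    struct script *s = ctx;

    (void)buf;
    (void)bufsiz;
    return s->results[s->next++];
}
```

Distinguishing the error case from a clean end of body lets a caller decide whether the count it reports is complete.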
Here, finally, is check_postdata using the preferred method of direct access to the input filters, using functions documented in util_filter.h.
We create a brigade and then loop until EOS, filling the brigade from the input filters. We will see this technique again in Chapter 8.
    static int check_postdata_new_method(request_rec *r)
    {
        apr_status_t status;
        int end = 0;
        apr_size_t bytes, count = 0;
        const char *buf;
        apr_bucket *b;
        apr_bucket_brigade *bb;

        /* Check whether there's any input to read. A client can tell
         * us that fact by using Content-Length or Transfer-Encoding.
         */
        int has_input = 0;
        const char *hdr = apr_table_get(r->headers_in, "Content-Length");
        if (hdr) {
            has_input = 1;
        }
        hdr = apr_table_get(r->headers_in, "Transfer-Encoding");
        if (hdr) {
            if (strcasecmp(hdr, "chunked") == 0) {
                has_input = 1;
            }
            else {
                ap_rprintf(r, "<p>Unsupported Transfer Encoding: %s</p>",
                           ap_escape_html(r->pool, hdr));
                return OK; /* we allow this, but just refuse to handle it */
            }
        }
        if (!has_input) {
            ap_rputs("<p>No request body.</p>\n", r);
            return OK;
        }

        /* OK, we have some input data. Now read and count it. */
        /* Create a brigade to put the data into. */
        bb = apr_brigade_create(r->pool, r->connection->bucket_alloc);

        /* Loop until we get an EOS on the input */
        do {
            /* Read a chunk of input into bb */
            status = ap_get_brigade(r->input_filters, bb, AP_MODE_READBYTES,
                                    APR_BLOCK_READ, BUFLEN);
            if (status == APR_SUCCESS) {
                /* Loop over the contents of bb */
                for (b = APR_BRIGADE_FIRST(bb);
                     b != APR_BRIGADE_SENTINEL(bb);
                     b = APR_BUCKET_NEXT(b)) {
                    /* Check for EOS */
                    if (APR_BUCKET_IS_EOS(b)) {
                        end = 1;
                        break;
                    }
                    /* Ignore other metadata */
                    else if (APR_BUCKET_IS_METADATA(b)) {
                        continue;
                    }
                    /* To get the actual length, we need to read the data */
                    status = apr_bucket_read(b, &buf, &bytes, APR_BLOCK_READ);
                    if (status != APR_SUCCESS) {
                        break;  /* don't count bytes from a failed read */
                    }
                    count += bytes;
                }
            }
            /* Discard data we're finished with */
            apr_brigade_cleanup(bb);
        } while (!end && (status == APR_SUCCESS));

        if (status == APR_SUCCESS) {
            ap_rprintf(r, "<p>Got %" APR_SIZE_T_FMT
                       " bytes of request body data.</p>\n", count);
            return OK;
        }
        else {
            ap_rputs("<p>Error reading request body.</p>", r);
            return OK; /* Just send the above message and ignore the data */
        }
    }
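As noted earlier, the size of the request body is usually, but not always, declared in the Content-Length header. A handler that wants to compare the bytes it counted against the declared size first has to parse that header value defensively, since it is client-supplied text. The following is a minimal sketch in plain C using only the standard library; parse_content_length is a hypothetical helper introduced here for illustration, not an Apache function.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a Content-Length header value.
 * Returns 1 and stores the value in *out on success.
 * Returns 0 if the header is missing, empty, signed, has leading
 * whitespace, contains trailing junk, or overflows a long long. */
int parse_content_length(const char *hdr, long long *out)
{
    char *end;
    long long v;

    /* Require a digit up front: strtoll alone would accept "-5" or " 7" */
    if (hdr == NULL || !isdigit((unsigned char)hdr[0])) {
        return 0;
    }
    errno = 0;
    v = strtoll(hdr, &end, 10);
    if (errno == ERANGE || *end != '\0') {
        return 0;
    }
    *out = v;
    return 1;
}
```

In a module, the hdr argument would be the result of apr_table_get(r->headers_in, "Content-Length"), which is NULL when the client sent no such header (for example, with a chunked body).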
5.2.1.3 I/O Errors
What happens when we get an I/O error?
Filters (covered in Chapter 8) indicate an error to us by returning an APR error code; they may also set r->status. Our handler can detect such an event, as in the preceding examples, by checking the return values from ap_pass_brigade and ap_get_brigade. Normal behavior is to stop processing and return an appropriate HTTP error code. This behavior causes Apache to send an error document (discussed in Chapter 6) to the client. We should also log an error message, thereby helping the systems administrator diagnose the problem.
But what if the error was that the client connection was terminated? It’s a waste of time trying to send an error document to a client that’s gone away. We can detect this disconnection by checking r->connection->aborted, as demonstrated in
the default handler found at the end of this chapter.
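The decision described above can be written down as a small pure function. This is a sketch of one plausible policy in plain C, not code taken from Apache: handler_result and the SKETCH_ constants are hypothetical names, although the numeric values match Apache's OK (0) and HTTP_INTERNAL_SERVER_ERROR (500). In a real handler, the two inputs would come from the return value of ap_pass_brigade and from r->connection->aborted.

```c
/* Local stand-ins for the handler return codes used in this sketch. */
enum {
    SKETCH_OK = 0,
    SKETCH_HTTP_INTERNAL_SERVER_ERROR = 500
};

/*
 * pass_failed:    nonzero if passing output down the filter chain
 *                 returned an error
 * client_aborted: nonzero if the client connection has gone away
 *
 * Returns the value the handler would hand back to the server.
 */
int handler_result(int pass_failed, int client_aborted)
{
    if (!pass_failed) {
        return SKETCH_OK;
    }
    if (client_aborted) {
        /* Nobody is left to read an error document, so don't ask for one */
        return SKETCH_OK;
    }
    /* A genuine output error: request an error document */
    return SKETCH_HTTP_INTERNAL_SERVER_ERROR;
}
```

A real handler would typically still log the failure in the error branch so that the administrator can diagnose it.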