Recall that the final step in the Apache startup process is to spawn or fork one or more child processes. When a child is spawned, it goes into a wait-service-wait loop: the child waits for a request, services the request, and then goes back into a listening state until its next request is received. Whereas the main Apache process runs as root, these child processes run as an unprivileged user and group. Doing so greatly increases the security of the server. If an attacker is able to break in through Apache itself, he will have only the privileges of that user, rather than the privileges of the root account.
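The wait-service-wait loop can be pictured with a short sketch. It is written in Python rather than Apache's C, and every name in it (child_loop, the pending queue, the None shutdown sentinel) is hypothetical and chosen only for illustration. A real child blocks on a listening socket, and the switch to the unprivileged user has already happened before the loop begins; neither detail is modeled here.

```python
import queue

def child_loop(pending, handle):
    """Model of one child: wait for a request, service it, wait again.

    `pending` stands in for the listening socket. None is an assumed
    shutdown signal (hypothetical, not an Apache mechanism).
    """
    served = 0
    while True:
        request = pending.get()      # wait: block until a request arrives
        if request is None:          # sentinel tells the child to exit
            return served
        handle(request)              # service the request
        served += 1                  # then loop back to waiting

# Usage: queue three requests, then the shutdown sentinel.
pending = queue.Queue()
for uri in ("/index.html", "/about.html", "/cgi-bin/form.pl"):
    pending.put(uri)
pending.put(None)

responses = []
count = child_loop(pending, lambda uri: responses.append("200 " + uri))
```

Here the child services the three queued requests in order and then exits when it sees the sentinel.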
■Caution
Just because Apache child processes run as an unprivileged user doesn’t mean that the server is secure. Even if an attacker gains access to the system only through this unprivileged user, he might be able to escalate his privileges through a number of attack vectors.
As a child receives a request, the request passes through a number of Apache procedures before a response is sent. Some of these procedures provide entry points into the Apache API. Using the Apache API, a module writer can affect the handling of the request at a much lower level than merely responding to the request at the CGI level after the Apache child process has passed it along. For example, using the Apache API, a programmer could send special headers back as part of the response much more efficiently than by writing a CGI script to accomplish the task.
The main processing for Apache requests is handled in the http_request.c source code file. In this file, a number of procedures are defined, including process_request, which calls process_request_internal. The process_request_internal procedure contains the heart of Apache request handling. The Apache handling procedures are described in Table 10-1.
CHAPTER 10 ■ APACHE AND MOD_PERL
Table 10-1. Apache Request Procedures
Procedure                  Description
location_walk              Apache checks the configuration file for any location directives that match the URI passed in the request.
translate_name             Apache takes the name from the URI and converts it to a name in relation to the local filesystem. This has nothing to do with translation between languages, but rather is how Apache converts a URI to a local file.
directory_walk             Now that Apache has converted the URI to a local resource, it examines the configuration file for any directory directives that might apply to this particular resource.
file_walk                  Apache examines the configuration file to find any file directives that might apply to the requested resource.
location_walk              Apache does another round of location walking in the configuration file to see if the translate_name procedure has changed the location, so that a location directive now applies.
header_parse               The header of the request is parsed.
check_access               A number of access checks are done, with check_access being the first. It checks for access based on the IP address of the request.
check_user_id              This procedure authenticates the remote user by checking the username and password pair supplied with the request.
check_auth                 This procedure checks authorization, that is, whether the authenticated user is allowed to access the requested resource.
find_types                 This procedure works with MIME types of the requested resource. At this stage, Apache chooses the correct content handler for the requested resource.
run_fixups                 This procedure is somewhat misnamed. During this phase of the Apache request/response cycle, the response header is written and the content may also be sent to the client. This phase can work in conjunction with the invoke_handler procedure, called next.
invoke_handler             During this phase, if another module is necessary for fulfilling the request, it is called. This handler may also write the response header and send the content.
finalize_request_protocol  This phase performs some cleanup actions on the request but shouldn’t be confused with any cleanup for the Apache child processes.
logging                    Though not a procedure name, logging may be performed at any step in the process if an error is encountered or when the request is processed.
■Note
The procedures described in Table 10-1 are used in the Apache 1.3 series, specifically from 1.3.33, for anyone keeping score at home. The procedures and the handling of requests are largely the same in Apache 2.
An Apache module, whether written in C or in Perl with the help of mod_perl, can implement these procedures to work with the Apache requests as they go through their various stages. For example, you might write a custom module for authentication, for logging, and so on. You’ll see how to do this in Chapter 11.
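Until then, a rough model may help fix the idea of a module hooking into one stage of the request. The sketch below is Python, not the Apache or mod_perl API: the hooks dictionary, the process_request function, and the deny_private check are all hypothetical, and the rule that a hook returns either None (carry on) or an HTTP status (stop) is a simplifying assumption. Only the phase names and their order come from Table 10-1.

```python
# Phase order as listed in Table 10-1 (location_walk appears twice).
PHASES = [
    "location_walk", "translate_name", "directory_walk", "file_walk",
    "location_walk", "header_parse", "check_access", "check_user_id",
    "check_auth", "find_types", "run_fixups", "invoke_handler",
    "finalize_request_protocol",
]

def process_request(request, hooks):
    """Run each phase in order; a hook returning a status ends the walk early.

    `hooks` maps a phase name to a callable (a hypothetical registry,
    not the real Apache module structure).
    """
    trace = []
    for phase in PHASES:
        trace.append(phase)
        hook = hooks.get(phase)
        if hook is None:
            continue                  # no module claimed this phase
        status = hook(request)
        if status is not None:        # e.g. 403: stop and report the error
            return status, trace
    return 200, trace

def deny_private(request):
    """Hypothetical access check: refuse anything under /private."""
    return 403 if request["uri"].startswith("/private") else None

hooks = {"check_access": deny_private}
ok_status, ok_trace = process_request({"uri": "/index.html"}, hooks)
denied_status, denied_trace = process_request({"uri": "/private/x"}, hooks)
```

The request for /index.html passes through every phase, while the request for /private/x stops at check_access and never reaches the content handler.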
Forking
Apache works by forking child processes that go off and handle the actual requests from clients. The Apache configuration file controls how these children work, including the number of child processes to fork, the number to keep around, and how long they should be kept around. Several configuration directives are important in this regard, including the following:
• maxrequestsperchild: This directive sets the number of requests that a given child will handle before dying and being replaced by another child. If the web server is serving buggy and/or poorly written programs that have problems like memory leaks, adjusting this value (setting it lower) will help to control that memory leak. However, the trade-off is that Apache will need to spawn another child process each time one dies (in accordance with minspareservers). The spawning of new child processes is not without overhead of its own. In practice, you don’t want to set the maxrequestsperchild value so low that Apache needs to fire up replacement child processes during busy times. The true solution is to fix whatever buggy and leaky programs are causing Apache to use extra memory during a child’s lifetime.
• maxclients: This directive sets the limit for the number of requests that can be serviced at any given time. By default, Apache sets a hard limit on this directive, making the maximum value 256. You can increase this setting by changing the value in the httpd.h header file and recompiling Apache. This directive is key for surviving a heavy load spike.
• listenbacklog: This directive sets the length of the queue for pending requests. The default for this, 511, is normally high enough.
• minspareservers: This directive is used to configure the minimum number of idle children to have awaiting a request. Keeping a child server around will prevent Apache from needing to spawn another child process if all the children are busy. The default is 5.
• maxspareservers: This directive is used to set the maximum number of child processes to have awaiting a request. The parent process will kill off idle child processes to preserve system resources. The default is 10.
• startservers: This directive sets the number of child processes to spawn when starting Apache. The default is 5. Setting this value too high will cause a slowdown in the Apache startup process.
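To see how these directives interact, consider a deliberately simplified model of the parent's bookkeeping. The sketch is Python and is not taken from the Apache source: maintain_pool and needs_replacing are hypothetical names, and the model corrects the pool in a single step, whereas the real parent adjusts it gradually. The defaults of 5, 10, and 256 are the ones given above. Treating a maxrequestsperchild of 0 as "no limit" matches Apache's documented behavior.

```python
def maintain_pool(idle, busy, min_spare=5, max_spare=10, max_clients=256):
    """Return how many children to add (positive) or remove (negative).

    Simplified model: one-step correction, no rate limiting.
    """
    total = idle + busy
    if idle < min_spare:
        wanted = min_spare - idle
        room = max(max_clients - total, 0)   # never exceed maxclients
        return min(wanted, room)
    if idle > max_spare:
        return -(idle - max_spare)           # kill off surplus idle children
    return 0

def needs_replacing(requests_served, max_requests_per_child):
    """True once a child has reached maxrequestsperchild (0 means no limit)."""
    return max_requests_per_child > 0 and requests_served >= max_requests_per_child

# Usage with the default settings described above.
after_start = maintain_pool(idle=5, busy=0)    # startservers=5: nothing to do
under_load = maintain_pool(idle=1, busy=40)    # too few idle children
near_limit = maintain_pool(idle=0, busy=254)   # growth capped by maxclients
quiet = maintain_pool(idle=14, busy=2)         # too many idle children
```

In this model, a busy server with one idle child asks for four more, a server two short of maxclients can add only two, and a quiet server with 14 idle children sheds four.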
■Caution
Keep in mind that the configuration parameters described here directly affect the performance of the Apache server. Setting these too low or too high can result in significantly decreased performance. I’m reluctant to give recommendations for these settings, since a large number of factors are involved in determining the optimum settings.