The most recent version of the Apache HTTP Server (2.2 at the time of writing) can be downloaded as source code from the Apache HTTP Server [111] website, or installed as a precompiled binary package from the repository of your favorite Linux distribution.
For the rest of this section we will refer to the Apache documentation for file names. This documentation is usually installed together with the Apache binary inside the DocumentRoot. If we cannot reach the local documentation, the official documentation [112] is still available on the Apache website. We will use a virtual network with Slackware 13.0 [113] inside VirtualBox [114], which is free (as in cost) and also available as Free (as in freedom) software with small restrictions.
Distribution-specific summaries for Debian Lenny [115] and for CentOS 5.4 [117], a clone of Red Hat Enterprise Linux [116], will follow below.
If we want to compile Apache from source, we use the usual configure, make, make install steps. For further details please refer to the documentation [118] page.
The web server binary httpd itself is usually located in /usr/sbin/. We can use the binary directly to start and stop the web server through command line options, but a better idea is to use the control script apachectl as the interface to httpd. apachectl controls the web server process (start and stop) in a convenient way, sets up the environment and checks the configuration file in the background. During the transition from Apache 1.3 to the Apache 2.x series the control script was called apache2ctl to tell it apart from the Apache 1.3 script, which was then called apachectl.
It is unfortunate that the LPI still refers to apache2ctl while the Apache source code produces apachectl.
[root@lpislack ~]# apachectl
Usage: /usr/sbin/httpd [-D name] [-d directory] [-f file]
                       [-C "directive"] [-c "directive"]
                       [-k start|restart|graceful|graceful-stop|stop]
                       [-v] [-V] [-h] [-l] [-L] [-t] [-S]
Options:
  -D name            : define a name for use in <IfDefine name> directives
  -d directory       : specify an alternate initial ServerRoot
  -f file            : specify an alternate ServerConfigFile
  -C "directive"     : process directive before reading config files
  -c "directive"     : process directive after reading config files
  -e level           : show startup errors of level (see LogLevel)
  -E file            : log startup errors to file
  -v                 : show version number
  -V                 : show compile settings
  -h                 : list available command line options (this page)
  -l                 : list compiled in modules
  -L                 : list available configuration directives
  -t -D DUMP_VHOSTS  : show parsed settings (currently only vhost settings)
  -S                 : a synonym for -t -D DUMP_VHOSTS
  -t -D DUMP_MODULES : show all loaded modules
  -M                 : a synonym for -t -D DUMP_MODULES
  -t                 : run syntax check for config files
Hmmm. This is the usage message of httpd, not of apachectl. The reason is that apachectl passes any parameter it does not understand directly to httpd, and when it is called without any parameter it ends up asking httpd for its help text.
apachectl can start, stop and restart the web server, but even more useful are graceful and graceful-stop, which restart or stop the web server without aborting currently open connections.
configtest does the same as httpd -t: it tests the Apache configuration file. The options status and fullstatus need the mod_status module to display useful status information about our HTTP server.
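The way apachectl hands its arguments on to httpd can be sketched in a few lines of Python. This is a simplified model and not the actual shell script: the function name httpd_args is made up for this illustration, and treating an empty argument list as a request for -h is an assumption based on the usage output shown above.

```python
def httpd_args(apachectl_args):
    """Return the httpd arguments a simplified apachectl would use, or None."""
    # Assumption: no argument at all is treated like a request for help.
    if not apachectl_args:
        return ["-h"]
    command = apachectl_args[0]
    # Process-control commands are handed to httpd's -k option.
    if command in ("start", "stop", "restart", "graceful", "graceful-stop"):
        return ["-k", command]
    # configtest is a syntax check, the same as httpd -t.
    if command == "configtest":
        return ["-t"]
    # status and fullstatus query the mod_status page instead of calling httpd.
    if command in ("status", "fullstatus"):
        return None
    # Anything else is passed through to httpd unchanged.
    return list(apachectl_args)


print(httpd_args(["graceful"]))
print(httpd_args(["-V"]))
```

In this model, apachectl -V behaves exactly like httpd -V, because -V is not a command apachectl knows itself.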
The logs for your Apache instance go to /var/log/httpd/. The two most important log files are access_log, which logs every access to the web server and error_log which only records errors. Tools like Awstats [119] and Webalizer [120] use the access_log to generate their reports.
A snippet of access_log (taken from the Debian Lenny machine lpidebian) shows the IP 192.168.10.21 accessing / on the website, which is the “welcome” page of the web server (more on this later), and then trying to GET /favicon.ico and /login.html. Both of these requests result in a “404” (“Not Found”) status, which error_log records as “File does not exist”.
192.168.10.21 - - [02/Jun/2009:17:06:01 -0400] "GET / HTTP/1.1" 200 56 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.10) Gecko/2009042315 Firefox/3.0.10"
192.168.10.21 - - [02/Jun/2009:17:17:12 -0400] "GET /favicon.ico HTTP/1.1" 404 300 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.10) Gecko/2009042315 Firefox/3.0.10"
192.168.10.21 - - [05/Jun/2009:16:41:39 -0400] "GET / HTTP/1.1" 200 56 "-" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux 2.6.27.7-smp) KHTML/3.5.10 (like Gecko)"
192.168.10.21 - - [05/Jun/2009:16:41:39 -0400] "GET /favicon.ico HTTP/1.1" 404 300 "-" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux 2.6.27.7-smp) KHTML/3.5.10 (like Gecko)"
192.168.10.21 - - [05/Jun/2009:16:41:50 -0400] "GET /login.html HTTP/1.1" 404 299 "-" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux 2.6.27.7-smp) KHTML/3.5.10 (like Gecko)"
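To show how the fields of such an entry fit together, here is a minimal Python sketch that splits one access_log line into its parts. The pattern COMBINED and the function parse_access_line are names chosen for this example only. The sketch assumes the layout of the entries shown above and does not handle escaped quotes inside a field.

```python
import re

# Fields in order: host, identity, user, time, request, status, size,
# referer, user agent (the layout of the entries shown above).
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return the fields of one entry as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


sample = (
    '192.168.10.21 - - [05/Jun/2009:16:41:50 -0400] "GET /login.html HTTP/1.1" '
    '404 299 "-" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux 2.6.27.7-smp) '
    'KHTML/3.5.10 (like Gecko)"'
)
entry = parse_access_line(sample)
print(entry["host"], entry["request"], entry["status"])
# prints: 192.168.10.21 GET /login.html HTTP/1.1 404
```

Report generators work on the same fields, although their real parsers are considerably more robust than this sketch.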
This snippet from error_log shows the same errors but in greater detail:
[Fri Jun 05 13:41:10 2009] [notice] mod_python: using mutex_directory /tmp
[Fri Jun 05 13:41:11 2009] [notice] Apache/2.2.9 (Debian) PHP/5.2.6-1+lenny3 with Suhosin-Patch mod_python/3.3.1 Python/2.5.2 mod_perl/2.0.4 Perl/v5.10.0 configured -- resuming normal operations
[Fri Jun 05 16:41:39 2009] [error] [client 192.168.10.21] File does not exist: /var/www/favicon.ico
[Fri Jun 05 16:41:50 2009] [error] [client 192.168.10.21] File does not exist: /var/www/login.html
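The error_log entries above follow a simple pattern: a timestamp, a severity level, an optional client address and the message. A minimal Python sketch for taking such a line apart could look like this. The names ERROR_LINE and parse_error_line are made up for this example, and the pattern only covers the layout seen in the snippet above.

```python
import re

# Layout seen above: [timestamp] [level] optional [client address] message
ERROR_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] '
    r'(?:\[client (?P<client>[^\]]+)\] )?(?P<message>.*)'
)


def parse_error_line(line):
    """Split one entry into time, level, client (may be None) and message."""
    match = ERROR_LINE.match(line)
    return match.groupdict() if match else None


entry = parse_error_line(
    "[Fri Jun 05 16:41:50 2009] [error] [client 192.168.10.21] "
    "File does not exist: /var/www/login.html"
)
print(entry["level"], entry["client"], entry["message"])
# prints: error 192.168.10.21 File does not exist: /var/www/login.html
```

Entries without a client, such as the [notice] lines at startup, yield None for the client field.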
The configuration of Apache takes place in /etc/httpd/httpd.conf. This lengthy but well-documented configuration file is in part structured similarly to an HTML page. To strip out the comments and empty lines you can simply use grep.
root@lpislack:~# grep -v ^# /etc/httpd/httpd.conf | grep -v ^$ | grep -v "^ #"
ServerRoot "/usr"
Listen 80
LoadModule auth_basic_module lib/httpd/modules/mod_auth_basic.so
LoadModule auth_digest_module lib/httpd/modules/mod_auth_digest.so
...
LoadModule log_config_module lib/httpd/modules/mod_log_config.so
LoadModule userdir_module lib/httpd/modules/mod_userdir.so
LoadModule alias_module lib/httpd/modules/mod_alias.so
LoadModule rewrite_module lib/httpd/modules/mod_rewrite.so
User apache
Group apache
ServerAdmin [email protected]
DocumentRoot "/srv/httpd/htdocs"
<Directory />
    Options FollowSymLinks
    AllowOverride None
    Order deny,allow
    Deny from all
</Directory>
<Directory "/srv/httpd/htdocs">
    Options Indexes FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
DirectoryIndex index.html
ErrorLog "/var/log/httpd/error_log"
LogLevel warn
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "/var/log/httpd/access_log" common
ScriptAlias /cgi-bin/ "/srv/httpd/cgi-bin/"
<Directory "/srv/httpd/cgi-bin">
    AllowOverride None
    Options None
    Order allow,deny
    Allow from all
</Directory>
DefaultType text/plain
TypesConfig /etc/httpd/mime.types
root@lpislack:~#
This slightly stripped-down httpd.conf is taken from a Slackware 13.0 system (lpislack). There are two terms to know when talking about httpd.conf: "directives" and "containers". "Directives" are the configuration options (and their values) themselves, while "containers" are sections such as <Directory> that apply to a directory or a collection of files. Any directive inside a container is only valid inside this container, while directives outside a container have global effect for the whole site. Conversely, there are directives that are only valid inside a container.
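The difference between global directives and directives inside a container can be made visible with a small Python sketch that labels every directive with its scope. The function name scope_directives is invented for this illustration, and the parsing is deliberately naive: it assumes one directive or container tag per line and is not how httpd itself reads the file.

```python
def scope_directives(config_text):
    """Return (scope, directive) pairs; scope is 'global' or the container tag."""
    containers = []  # stack of currently open container tags
    result = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("</"):
            containers.pop()  # a closing tag ends the innermost container
        elif line.startswith("<"):
            containers.append(line)  # an opening tag starts a new container
        else:
            scope = containers[-1] if containers else "global"
            result.append((scope, line))
    return result


sample = """\
Listen 80
<Directory />
    Options FollowSymLinks
    Deny from all
</Directory>
DirectoryIndex index.html
"""
for scope, directive in scope_directives(sample):
    print(scope, "|", directive)
```

For the sample, Listen 80 and DirectoryIndex index.html are reported as global, while the two directives between the tags belong to the container <Directory />.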
ServerRoot "/usr"
This is a tricky one. All relative paths in the configuration start from here, while absolute paths are, as implied by the name, absolute.
Listen 80
The TCP port on which httpd listens for incoming connection requests. If our machine has more than one network address, we can bind httpd to one (or more) IP address/port combinations here as well.
LoadModule auth_basic_module lib/httpd/modules/mod_auth_basic.so
Loads the module auth_basic_module located in lib/httpd/modules/mod_auth_basic.so relative to the ServerRoot, so the whole path to this module is /usr/lib/httpd/modules/mod_auth_basic.so.
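The rule for relative and absolute paths can be tried out with a short Python sketch. The helper resolve_path is a name made up for this example; it only mirrors the rule described here and is not part of Apache.

```python
import posixpath


def resolve_path(server_root, path):
    """Return path unchanged if it is absolute, otherwise joined onto server_root."""
    # posixpath.join discards server_root when path is already absolute.
    return posixpath.join(server_root, path)


print(resolve_path("/usr", "lib/httpd/modules/mod_auth_basic.so"))
# prints: /usr/lib/httpd/modules/mod_auth_basic.so
print(resolve_path("/usr", "/var/log/httpd/error_log"))
# prints: /var/log/httpd/error_log
```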
User apache
The user account httpd runs as. This should be a restricted account. One httpd process (the first) has to run as root if it wants to claim port 80.
Group apache
The group of the user httpd runs as.
ServerAdmin [email protected]
The e-mail address of the administrator responsible for running httpd. It is shown to the client when errors occur.
DocumentRoot "/srv/httpd/htdocs"
This is the directory where the actual HTML documents live on your hard drive!
<Directory /> ... </Directory>
This is a container object. All directives inside are only valid for this directory "/" and all of its subdirectories.
Options FollowSymLinks
Potential security risk! Does what its name suggests: httpd follows symbolic links in this directory.
AllowOverride None
Most directives can be overridden with a .htaccess file. This is a security risk, so this directive disables the use of .htaccess files.
Order deny,allow
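Controls the order in which the Deny and Allow directives are evaluated. With deny,allow the Deny directives are evaluated first (who is not allowed), then the Allow directives (who is allowed). The keyword named last is also the default: if no directive matches, or if both match, the last one wins (here: allow).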
Deny from all
Denies all hosts access to all files in this container.
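How Order combines the Allow and Deny directives is easier to see in a small truth-table style sketch. The Python function access_allowed below is a hypothetical helper written for this illustration. It assumes we already know whether the client matched any Allow or any Deny directive, and it leaves out the third Order variant, mutual-failure.

```python
def access_allowed(order, matches_allow, matches_deny):
    """Decide access from the Order value and the Allow/Deny match results."""
    if order == "deny,allow":
        # Deny is evaluated first, Allow last; the default is to allow.
        return matches_allow or not matches_deny
    if order == "allow,deny":
        # Allow is evaluated first, Deny last; the default is to deny.
        return matches_allow and not matches_deny
    raise ValueError("unsupported order: " + order)


# <Directory />: Order deny,allow with Deny from all and no Allow directive
print(access_allowed("deny,allow", matches_allow=False, matches_deny=True))
# prints: False

# A client that matches neither directive is allowed under deny,allow ...
print(access_allowed("deny,allow", matches_allow=False, matches_deny=False))
# prints: True

# ... but denied under allow,deny
print(access_allowed("allow,deny", matches_allow=False, matches_deny=False))
# prints: False
```

The first call corresponds to the <Directory /> container above: every host matches Deny from all and there is no Allow directive, so access is refused.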
<Directory "/srv/httpd/htdocs"> ... </Directory>
Container for the DocumentRoot directory. Note the Order allow,deny and the Allow from all directives. Here we want access from all hosts.
DirectoryIndex index.html
The file with this name is presented to the client when a web browser requests a directory rather than a specific HTML page.
If no index.html exists in this directory, the contents of the directory itself are shown. Options Indexes allows this, while Options -Indexes generates an error message instead of listing the directory's contents.
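The interplay of DirectoryIndex and Options Indexes boils down to a short decision, sketched here in Python. The function directory_response and the file names in the calls are invented for this example. The sketch assumes that the error message sent when listings are disabled is a 403 (Forbidden) response.

```python
def directory_response(files, directory_index="index.html", indexes_enabled=True):
    """Describe what a request for a directory returns in this simplified model."""
    if directory_index in files:
        return "serve " + directory_index
    if indexes_enabled:
        return "list directory contents"  # Options Indexes
    return "403 Forbidden"  # Options -Indexes


print(directory_response(["index.html", "logo.png"]))
# prints: serve index.html
print(directory_response(["logo.png"]))
# prints: list directory contents
print(directory_response(["logo.png"], indexes_enabled=False))
# prints: 403 Forbidden
```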
ErrorLog "/var/log/httpd/error_log"
Sets the logfile for error messages.
LogLevel warn
Sets the verbosity of the error messages.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Sets the format of the entries in the custom log file (usually access_log) and gives this format the nickname combined.
CustomLog "/var/log/httpd/access_log" common
Sets the name and location of the custom log file, followed by the nickname of the log format to use for it.
ScriptAlias /cgi-bin/ "/srv/httpd/cgi-bin/"
Directory for CGI scripts.
DefaultType text/plain
Apache uses this MIME type for the documents it provides to the web browser if it cannot determine their type by other means, for example from the file name extension.
TypesConfig /etc/httpd/mime.types
The file with the list of MIME types to use for the different file name extensions.
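How TypesConfig and DefaultType work together can be sketched as a table lookup with a fallback. The three-entry table MIME_TYPES and the function content_type below are made up for this Python illustration. A real mime.types file contains many more entries, and httpd has further ways of determining a type.

```python
import posixpath

# Hypothetical excerpt of a mime.types style table: extension -> MIME type.
MIME_TYPES = {"html": "text/html", "png": "image/png", "css": "text/css"}


def content_type(filename, default_type="text/plain"):
    """Look up the MIME type by extension, falling back to default_type."""
    extension = posixpath.splitext(filename)[1].lstrip(".").lower()
    return MIME_TYPES.get(extension, default_type)


print(content_type("index.html"))
# prints: text/html
print(content_type("README"))
# prints: text/plain
```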