How To Build the Apache of Your Dreams

The best part about Apache is that you can custom build it to include exactly what you need. The default configuration is a good one, but its far-too-general nature is, by definition, not the best choice for the majority of installations. With a host of plug-in modules available for free over the Internet, customizing Apache to its fullest extent is not only fast and easy, but well worth the time spent.

Apache began as a series of patches to the National Center for Supercomputing Applications’ httpd. After httpd’s lead developer left NCSA and active development of httpd began to stagnate, programmers from around the world found they needed a central repository to maintain the body of code and patches that had accumulated. A group of these webmasters banded together and, with machines and bandwidth donated by HotWired, set up an informal coalition to direct the development of this new server. Dubbed Apache (“a patchy server”), it quickly became the most popular server on the Net after its birth in April 1995, at version 0.6.2.

Today, with over 60% of the webserver market share according to Netcraft, Apache is a shining example of how a well-planned and well-implemented piece of (free) software can be, far and away, the best application of its type, even better than high-priced commercial alternatives.

The Modular Design of Apache

Apache was designed from scratch to be modular; that is, the original programmers assumed that it would be extended by other developers, who would write small pieces of code which could be integrated into Apache with ease. They did this by creating a modular API and a well-defined series of phases that every request goes through, so that customizing a particular aspect of Apache is often as simple as stringing together API methods that run during a particular phase of the request. These phases include everything from server initialization (when Apache reads its configuration files), to translating a requested URL into a filename on the server, to logging the results of the transaction, and everything in between.

Developers were quick to respond, and to date there are hundreds of Apache modules available. Many of them are registered with the Apache project, and can be found at modules.apache.org. Chances are pretty good that if there is something you need, someone else has also needed it in the past, and written it. The important question, of course, is how to take advantage of these great resources.

Apache’s modularity can potentially make configuration complicated. By default, Apache ships with a number of useful modules, and the most generally useful of these are enabled by default. Compiling Apache as it is distributed will give you a highly functional, and very flexible, web server capable of handling most of the needs of a general purpose web site.

I know of very few general purpose web sites, however, and they are mostly ISPs. While it is a good starting point, the generic Apache configuration is probably not optimal for you. A little knowledge of the standard modules and what they do can make for a faster, more secure web server, simpler configuration files, and a host of new and exciting features.

Building Apache

Get the latest copy of the Apache source code from http://www.apache.org/dist/. The current version as of this writing is 2.0.48, but the examples in this article use the 1.3 series (apache_1.3.12), whose configure options differ from those of 2.0. Download the source to a suitable place, e.g., /tmp.

    # cd /tmp
    # gunzip -c apache_1.3.12.tar.gz | tar xf -
    # cd apache_1.3.12/

Once we have the source gunzipped and de-tarred, we are ready to begin.

Building Apache is a four-part process. Step one is deciding how we want Apache to be configured; step two is invoking the compiler; step three is testing the resulting compiled binary; and step four is putting everything in place. Luckily, we have make to speed us along with the last three steps (make, make test, and make install). Configuration, on the other hand, requires some forethought and at least a rudimentary knowledge of what type of web site the server is going to be used for. To configure the build, we’ll use the configure script and pass options to define how Apache is to be built.
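The four steps can be sketched as a small shell function. This is only a sketch under my own assumptions: the run helper and the DRY_RUN variable are not part of Apache, I added them so the sequence can be previewed without actually building anything.

```shell
#!/bin/sh
# Sketch of the four build steps described above.
# run() and DRY_RUN are illustrative helpers, not part of Apache.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

build_apache() {
    # Step 1: configure (arguments to this function are passed to configure).
    run ./configure "$@" || return 1
    # Step 2: compile.
    run make || return 1
    # Step 3: test the resulting binary.
    run make test || return 1
    # Step 4: put everything in place.
    run make install
}
```

Run from the root of the source tree, DRY_RUN=1 build_apache --enable-module=speling prints the four commands without executing them; drop DRY_RUN=1 to run the real build.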

Module Definitions and Groupings

Modules can be grouped into a few major groups: Access Control, Authentication, and Authorization; Header-generating and -controlling modules; Content generating (or filtering) modules; and Utility modules. The descriptions come from the Apache Documentation, but the groups are mine (you’ll probably never see anything like this in the official docs, but I find it easier to categorize them this way). The modules that are enabled by default are the ones listed in the httpd -l output of the default build, shown under Case One below.

Content Generating and Related Modules

  • mod_autoindex provides for automatic directory indexing.
  • mod_actions lets you run CGI scripts whenever a file of a certain type is requested. This makes it much easier to execute scripts that process files.
  • mod_dir provides for “trailing slash” redirects and serving directory index files.
  • mod_cgi provides for execution of CGI scripts.
  • mod_imap provides for .map files, replacing the functionality of the imagemap CGI program.
  • mod_include provides for server-parsed html documents.
  • mod_isapi provides support for ISAPI Extensions when running under Microsoft Windows.
  • mod_mime provides for determining the types of files from the filename.
  • mod_mime_magic attempts to determine the MIME type of a file by looking at a few bytes of its contents, the same way the Unix file(1) command works.
  • mod_mmap_static provides mmap()ing of a statically configured list of frequently requested but not changed files.
  • mod_speling [sic] attempts to correct misspellings of URLs that users might have entered, by ignoring capitalization and by allowing up to one misspelling.
  • mod_status allows a server administrator to find out how well their server is performing.
  • mod_userdir provides for user-specific directories, in the form http://www.server.com/~user/.

Access Control, Authentication, and Authorization Modules

  • mod_access provides access control based on client hostname or IP address.
  • mod_auth provides for user authentication using textual files.
  • mod_auth_anon allows “anonymous” user access to authenticated areas.
  • mod_auth_db provides for user authentication using Berkeley DB files.
  • mod_auth_dbm provides for user authentication using DBM files.
  • mod_digest provides for user authentication using MD5 Digest Authentication. It is replaced by mod_auth_digest.
  • mod_auth_digest provides for user authentication using MD5 Digest Authentication. It is an updated, though untested, version of mod_digest, with many more options.

Utility Modules

  • mod_alias provides for mapping different parts of the host filesystem into the document tree, and for URL redirection.
  • mod_info provides a comprehensive overview of the server configuration including all installed modules and directives in the configuration files.
  • mod_log_config provides for logging of the requests made to the server, using the Common Log Format or a user-specified format.
  • mod_proxy provides for an HTTP 1.0 caching proxy server.
  • mod_rewrite provides a rule-based rewriting engine to rewrite requested URLs on the fly.
  • mod_so provides for loading of executable code and modules into the server at start-up or restart time. On Unix, the loaded code typically comes from shared object files (usually with .so extension), whilst on Windows this module loads DLL files.
  • mod_vhost_alias provides support for dynamically configured mass virtual hosting.

Header Generating Modules and Header Control

  • mod_asis provides for .asis files. .asis files have headers prepended to the content and are sent, well, as is.
  • mod_cern_meta provides for CERN httpd metafile semantics.
  • mod_env provides for passing environment variables to CGI/SSI scripts.
  • mod_expires provides for the generation of Expires headers according to user-specified criteria.
  • mod_headers allows for the customization of HTTP response headers. Headers can be merged, replaced or removed.
  • mod_negotiation provides for content negotiation based on, for example, language preference or browser type.
  • mod_setenvif provides for the ability to set environment variables based upon attributes of the request.
  • mod_unique_id provides a magic token for each request which is guaranteed to be unique across “all” requests under very specific conditions.
  • mod_usertrack generates a ‘clickstream’ log of user activity on a site using cookies.


Diversion: Shared Modules (mod_so)

mod_so is a special case, and deserves special treatment, since it may affect how many of the other modules are built. mod_so enables modules to be linked into Apache at run time (i.e., when httpd is started), rather than at compile time. This is similar to how shared libraries operate under Unixen (or DLLs under Win32). Normally, compiling Apache produces a single, self-contained executable named httpd, but with mod_so, modules can be compiled as shared object files. These are kept separate from the httpd binary, and are activated in configuration files. This allows, for example, several copies of Apache (bound to different interfaces, or on different boxes sharing a common NFS share) to run differently configured httpd processes while the administrator only has to maintain one Apache install.

Up to Apache 1.3, mod_so was unstable, but at this point it is considered solid and can be used in production environments. However, there are some modules that perform differently or have different capabilities if built shared. For example, if mod_perl is not compiled statically, it will lack the Perl SSI capabilities, because mod_include requires defines that are only set if mod_perl is not compiled shared. For the most part, however, standard modules behave identically regardless of whether they are built shared or statically. In fact, some folks prefer to compile mod_so statically, and have everything else as shared modules.

--enable-shared is used in conjunction with mod_so, and allows specific modules to be built as shared modules so they can be loaded into the httpd binary. If mod_so is built into the server (--enable-module=so), any module that is to be built in this way should be mentioned in an --enable-shared parameter. For example, compiling the Proxy and Rewrite modules this way would require --enable-module=proxy --enable-module=rewrite --enable-shared=proxy --enable-shared=rewrite to be passed to the configure script. Note that both --enable-module and --enable-shared are required.
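Since every shared module needs the same pair of options, the list can be generated rather than typed. The helper below is a hypothetical convenience of mine (shared_module_flags is not something Apache ships), and it assumes you also want mod_so enabled explicitly, as described above.

```shell
#!/bin/sh
# Build the paired options needed to compile modules as shared objects.
# shared_module_flags is a hypothetical helper for illustration.

shared_module_flags() {
    # mod_so itself is enabled explicitly, following the text above.
    flags="--enable-module=so"
    for m in "$@"; do
        # Each module must be both enabled and marked as shared.
        flags="$flags --enable-module=$m --enable-shared=$m"
    done
    printf '%s\n' "$flags"
}

shared_module_flags proxy rewrite
```

The output can be passed straight to the configure script, e.g. ./configure $(shared_module_flags proxy rewrite). The options come out in a different order than in the paragraph above, but the set is the same.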

Adding shared modules after the main job of configuring and compiling the server is completed is made simple by one of the included utility programs, called apxs. This script takes one or more of a number of options and invokes your compiler to compile the module, and optionally puts it into place and activates it in the server config file. Enabling mod_so is particularly useful for developers who are writing modules in C, since modules are kept separate (as .so files on Unix or .dll files on Win32) and updating a module is as simple as recompiling the shared module with apxs and restarting the server. Without mod_so and shared modules, any change, however small, to a C source file requires a complete recompilation of Apache.
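As a rough illustration of the apxs workflow, the sketch below wraps the compile, install, and activate steps in one function. The function name, the APXS variable, and the file mod_example.c are my own placeholders; the -c, -i, and -a options are the standard apxs switches.

```shell
#!/bin/sh
# Compile, install, and activate a module with apxs.
# APXS lets you point at a specific apxs binary (an assumption of this sketch);
# mod_example.c is a hypothetical module source file.

APXS=${APXS:-apxs}

install_module() {
    # -c compiles the source into a shared object,
    # -i installs it into the server's module directory,
    # -a activates it by adding the LoadModule line to the server config.
    "$APXS" -c -i -a "$1"
}

# Example (needs an Apache built with mod_so):
#   install_module mod_example.c
```

After the module is installed, restart the server so the new shared object is loaded.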

When using mod_so and apxs, if your copy of Perl is in a non-standard location (something other than /usr/bin/perl or /usr/local/bin/perl), you should pass the option --with-perl=/path/to/perl to configure, since apxs is a Perl script and needs to point to the correct Perl.

Configuration Methods, and the Modules Who Love Them

Up to Apache 1.3, configuration was handled by editing a file called Configuration, which contained directives that were used to generate Makefiles. As of Apache 1.3, however, there is another option for creating Makefiles, modeled on GNU autoconf. This new method is called the APache AutoConf-style Interface (APACI), and is much simpler to use; the script is called configure. The syntax of the configure script, in condensed form, can be obtained by typing ./configure --help. (Since configuration is usually done as root, and “.” should never be in root’s path, all examples of using configure will be written as “./configure”.)

Configuration via the configure script is controlled by passing command-line options. The main options that will be used to create the Makefile are --enable-module/--disable-module, --enable-shared, and --with-layout.

The --enable-module switch is used to enable one of the standard modules. This switch is only necessary for enabling modules that are not enabled by default. Similarly, the --disable-module switch is for disabling modules that are enabled by default but are not desired or needed.

Using --enable-module or --disable-module requires the name of the module, without the initial mod_. For example, to enable the Proxy module, you would pass --enable-module=proxy, and disabling the AutoIndexing module requires passing --disable-module=autoindex.
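The naming rule is easy to get wrong when copying module names from the documentation, so here is a minimal sketch of it in shell. module_flag is a hypothetical helper of mine, not part of the Apache distribution; it simply strips a leading mod_ and prints the matching option.

```shell
#!/bin/sh
# Turn a module name into a configure option, dropping any "mod_" prefix.
# module_flag is a hypothetical helper, not something shipped with Apache.

module_flag() {
    action=$1            # "enable" or "disable"
    name=${2#mod_}       # strip a leading "mod_" if present
    case $action in
        enable|disable)
            printf '%s\n' "--${action}-module=${name}"
            ;;
        *)
            echo "usage: module_flag enable|disable MODULE" >&2
            return 1
            ;;
    esac
}

module_flag enable mod_proxy
module_flag disable autoindex
```

The two calls at the end print --enable-module=proxy and --disable-module=autoindex, the same options used in the examples above.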

Diversion: Layouts

Using --with-layout, you can easily indicate where Apache should go after it’s compiled. Locations for binaries, man pages, configuration files, and the like can be set using specific options to the configure script, or they can be defined all at once in a layout. Layouts are defined in the config.layout file, in the root of the Apache source. There are several layouts predefined, including Mac OS X Server, BeOS, a typical GNU layout, and others. We’ll use the layout named Apache as an example, since it’s the default; if a layout is not specified, configure uses the Apache layout as a base, and any command line options override individual directives. (The comments on the right hand side of this table indicate the command line options that can be used for overrides.)

    <Layout Apache>
        prefix:        /usr/local/apache      # --prefix=DIR
        exec_prefix:   $prefix                # --exec-prefix=DIR
        bindir:        $exec_prefix/bin       # --bindir=DIR
        sbindir:       $exec_prefix/bin       # --sbindir=DIR
        libexecdir:    $exec_prefix/libexec   # --libexecdir=DIR
        mandir:        $prefix/man            # --mandir=DIR
        sysconfdir:    $prefix/conf           # --sysconfdir=DIR
        datadir:       $prefix                # --datadir=DIR
        iconsdir:      $datadir/icons         # --iconsdir=DIR
        htdocsdir:     $datadir/htdocs        # --htdocsdir=DIR
        cgidir:        $datadir/cgi-bin       # --cgidir=DIR
        includedir:    $prefix/include        # --includedir=DIR
        localstatedir: $prefix                # --localstatedir=DIR
        runtimedir:    $localstatedir/logs    # --runtimedir=DIR
        logfiledir:    $localstatedir/logs    # --logfiledir=DIR
        proxycachedir: $localstatedir/proxy   # --proxycachedir=DIR
    </Layout>

These sections define where the various pieces of the final package will go. The whole thing will be based in /usr/local/apache, with all binaries (httpd and the support programs) in /usr/local/apache/bin, man pages based in /usr/local/apache/man, configuration files in /usr/local/apache/conf, and so on. Changes to the layout can be made to one of the existing <Layout> sections, or a new one can be added by copying and pasting.

Here is a layout similar to the one that I generally use:

    <Layout MyLayout>
        prefix:        /usr/local/apache
        exec_prefix:   $prefix
        bindir:        $exec_prefix/bin
        sbindir:       $exec_prefix/bin
        libexecdir:    $exec_prefix/libexec
        mandir:        /usr/local/man
        sysconfdir:    /etc/apache
        datadir:       $prefix
        iconsdir:      $datadir/icons
        htdocsdir:     $datadir/html
        cgidir:        $datadir/cgi-bin
        includedir:    /usr/include/apache
        localstatedir: /var/apache
        runtimedir:    $localstatedir/logs
        logfiledir:    $localstatedir/logs
        proxycachedir: $localstatedir/proxy
    </Layout>

This is very similar to the Apache layout, except for a few things: man pages go into /usr/local/man, so they can be easily retrieved with a regular call to ‘man httpd’; include files go into /usr/include/apache, so that they can be more easily used when I write to the Apache API; and configuration files go into /etc/apache. Logs, proxy stuff, and things like PID files go into /var/apache; I mount a separate partition as /var so my root filesystem doesn’t fill up with logs, which makes it an ideal place for webserver logs.
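Before committing to a build, it helps to check where a layout will actually put things. The sketch below assumes the MyLayout section has been added to config.layout and uses configure's --show-layout option to print the resolved paths; the show_layout function and the CONFIGURE variable are names I made up for the example.

```shell
#!/bin/sh
# Preview the installation paths for a named layout from config.layout.
# CONFIGURE defaults to ./configure in the root of the Apache source;
# show_layout is a hypothetical wrapper for illustration.

CONFIGURE=${CONFIGURE:-./configure}

show_layout() {
    # --show-layout prints the final path layout instead of configuring a build.
    "$CONFIGURE" --with-layout="$1" --show-layout
}

# Example (run from the root of the Apache source tree):
#   show_layout MyLayout
```

If the printed paths are not what you expect, fix the layout (or add an override such as --logfiledir=DIR) before running the real configure.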

Building Apache, Really

OK, enough background; let’s compile Apache. The actual build process is relatively speedy, for all of the planning that goes into it. Compilation happens in a series of steps, where each module is compiled separately and turned into a library, and then all those libraries get linked together into a single executable, except when using mod_so, in which case this step is skipped for modules that are to remain shared.

Case One: ISP (www.foo.isp)

The simplest case is that of the default configuration, for example, something an ISP might use. Since we can accept all the default modules, configuration is a matter of:

    # ./configure

That’s it. It doesn’t get much easier than this. configure is nice, and warns you about using the default configuration, but in this case, it’s what we want so we can ignore it, and proceed with make, make test, and make install. Our completed binary looks like this:

    # /usr/local/apache/bin/httpd -l
    Compiled-in modules:
      http_core.c
      mod_env.c
      mod_log_config.c
      mod_mime.c
      mod_negotiation.c
      mod_status.c
      mod_include.c
      mod_autoindex.c
      mod_dir.c
      mod_cgi.c
      mod_asis.c
      mod_imap.c
      mod_actions.c
      mod_userdir.c
      mod_alias.c
      mod_access.c
      mod_auth.c
      mod_setenvif.c

Case Two: Corporate Web Site (www.content-heaven.com)

For this example, we are going to build a copy of Apache for a well-designed commercial website (by well-designed, I mean that we have complete control over what types of files will go on it).

We have no imagemaps or asis files, so we can disable mod_imap and mod_asis. There are no user directories on it, and all directories have index files, so we can disable mod_userdir and mod_autoindex. And, finally, none of our pages require any sort of authentication, so we can disable mod_auth (the other mod_auth_* modules are not compiled in by default). We will keep mod_access, however, to protect our server-status page.

We need mod_status so we can keep track of the status of the server, and mod_access to limit access to that page to our domain only (for internal usage). mod_dir lets us specify that each directory has a default index file of index.shtml, which Apache serves whenever someone requests a directory, i.e., a URL that ends with a /. (Using mod_mime’s AddHandler directive, we have defined files with a .shtml extension to be handled by mod_include, which means that the web server will parse them for special processing directives, which it will execute.) These are all enabled by default, and require no extra --enable-module options. Since our marketing department saw fit to publish mixed-case URLs in our advertisements, we will need mod_speling, which makes URLs case insensitive (--enable-module=speling).
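To make the module choices concrete, here is a sketch of the httpd.conf directives this setup implies, written out by a small shell function. The function name, the output file, and the .content-heaven.com access rule are my assumptions, not the site's actual configuration; the directives themselves are the standard Apache 1.3 ones (AddHandler belongs to mod_mime).

```shell
#!/bin/sh
# Write an httpd.conf fragment matching the setup described above.
# write_fragment and the file name are hypothetical; the access rule for
# .content-heaven.com is an assumption about what "our domain" means.

write_fragment() {
    cat > "$1" <<'EOF'
# mod_dir: serve index.shtml when a directory is requested
DirectoryIndex index.shtml

# mod_mime + mod_include: parse .shtml files for SSI directives
# (the document directories also need "Options Includes")
AddType text/html .shtml
AddHandler server-parsed .shtml

# mod_speling: tolerate mixed-case URLs
CheckSpelling on

# mod_status + mod_access: status page for internal use only
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from .content-heaven.com
</Location>
EOF
}

# Example:
#   write_fragment content-heaven.conf
```

Review the generated file and merge the directives into your own httpd.conf rather than using it blindly.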

Don’t forget that when specifying modules to enable or disable, you need to list the name of the module, without the “mod_” prefix.

The standard Apache layout is almost exactly what we need, except for one thing: we would like log files to go into our NFS-mounted log directory, /logs/httpd. We can accomplish this by passing --logfiledir=/logs/httpd to the configure script.

Here is what we will type at the command line:

    # cd /usr/local/src/apache_1.3.12
    # ./configure --with-layout=Apache \
        --logfiledir=/logs/httpd \
        --enable-module=speling \
        --disable-module=imap \
        --disable-module=asis \
        --disable-module=userdir \
        --disable-module=autoindex \
        --disable-module=auth \
        --verbose

configure will store these options in a file called config.status in the root of the source tree (where the configure script lives), so the build can be duplicated easily (it is informative to look at this file, to see what configure thinks you meant). After configure finishes running, you will get your prompt back; type make and watch the messages fly across the screen. Once the compiling is completed (again, when you get your prompt back), make install will put the files into the directories specified by the layout chosen (this may require root access to the machine, depending on where the files are going).

Our finished binary looks like:

    # /usr/local/apache/bin/httpd -l
    Compiled-in modules:
      http_core.c
      mod_env.c
      mod_log_config.c
      mod_mime.c
      mod_negotiation.c
      mod_status.c
      mod_include.c
      mod_dir.c
      mod_cgi.c
      mod_actions.c
      mod_speling.c
      mod_alias.c
      mod_access.c
      mod_setenvif.c

Slim, compact, and to the point.
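Eyeballing the httpd -l listing works, but a quick check is less error-prone when you build more than one server. The sketch below reads a module list on standard input and reports any required module that is missing; require_modules is a hypothetical helper of mine, and it assumes the list uses the mod_NAME.c form shown above.

```shell
#!/bin/sh
# Check a module list (as printed by "httpd -l") for required modules.
# require_modules is a hypothetical helper; it reads the list on stdin.

require_modules() {
    list=$(cat)
    status=0
    for m in "$@"; do
        # Look for the compiled-in source name, e.g. mod_speling.c
        if ! printf '%s\n' "$list" | grep -q "mod_$m\.c"; then
            echo "missing: mod_$m"
            status=1
        fi
    done
    return $status
}

# Example:
#   /usr/local/apache/bin/httpd -l | require_modules speling status access
```

The function returns a non-zero status if anything is missing, so it can be used in a build script to stop before make install.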

Case Three: Graphics Server (graphics.content-heaven.com)

In addition to the general HTML-serving httpd, Content Heaven, Inc has decided to use a dedicated server specifically for serving their images and graphics. In this common scenario, only a few of Apache’s modules are needed, since the server will be doing one thing, and one thing only: sending files from disk over the network. Thus, we can disable many of the standard modules that we left untouched before, such as mod_access, mod_include, mod_dir, and mod_cgi, in addition to the ones we had disabled earlier. Finally, let’s include mod_rewrite in the graphics server, for redirecting direct requests for graphics.content-heaven.com to www.content-heaven.com.

Our configure command looks like this:

    # ./configure --with-layout=Apache \
        --logfiledir=/logs/httpd \
        --disable-module=imap \
        --disable-module=asis \
        --disable-module=userdir \
        --disable-module=autoindex \
        --disable-module=auth \
        --disable-module=access \
        --disable-module=include \
        --disable-module=dir \
        --disable-module=cgi \
        --disable-module=env \
        --disable-module=setenvif \
        --disable-module=negotiation \
        --enable-module=rewrite \
        --verbose

Notice that it looks very similar to the previous example. Once it has completed compiling, our binary looks like this:

    # /usr/local/apache/bin/httpd -l
    Compiled-in modules:
      http_core.c
      mod_log_config.c
      mod_mime.c
      mod_status.c
      mod_actions.c
      mod_alias.c
      mod_rewrite.c

This is even more slimmed down than the previous one, and contains only the exact modules we need.

Apache Module Registry

The Apache Module Registry is your key to a really cool customized web server. It lists tons of modules for things like authentication and parameter parsing, as well as embedded languages such as Perl (mod_perl), Java (Apache JServ, allowing for embedded Java, and The Jakarta Project, a JSP implementation), Python (mod_snake), Tcl (mod_tcl), and others.

Last Thoughts

Building a customized Apache from source is easy, with a little planning. By using only the modules you need, you can build a web server that is fast, streamlined, and simple to maintain. Most of the work of building Apache is in the planning; deciding which of the three dozen modules you need is where the difficulty lies. Hopefully, this article will push you to investigate the standard modules, and look into the additional modules that are available.
