This article introduces those new to networking to Apache, the Hypertext Transfer Protocol (HTTP), and the basics of system administration. It is excerpted from chapter one of Peter Wainwright's book Pro Apache (Apress, 2004; ISBN: 1590593006).
Apache doesn’t run like a user application such as a word processor. Instead, it runs behind the scenes, providing services for other applications that communicate with it, such as a Web browser.
NOTE In Unix terminology, applications that provide services rather than directly communicate with users are called daemons. Apache runs on Windows NT, where the same concept is known as a service. Windows 95/98 and Windows ME aren’t capable of running Apache as a service; it must be run from the command line (the MS-DOS prompt or the Start menu’s Run command), even though Apache doesn’t interact with the user once it’s running.
Apache is designed to work over a network, so Apache and the applications that talk to it don’t have to be on the same computer. These applications are generically known as clients. Of course, a network can be defined as anything from a local intranet to the whole Internet, depending on the server’s purpose and target audience. I’ll cover networks in more detail later in this chapter.
The most common kind of client is of course a Web browser; most of the time when I say client, I mean browser. However, there are several important clients that aren’t browsers. The most important are Web robots and crawlers that index Web sites, but don’t forget streaming media players, news ticker applications, and other desktop tools that query Internet servers for information. Web proxies are also a kind of client because they forward requests for other clients.
The main task of a Web server is to translate a request into a response suitable for the circumstances at the time. When the client opens communication with Apache, it sends Apache a request for a resource. Apache either provides that resource or provides an alternative response to explain why the request couldn’t be fulfilled. In many cases, the resource is a Hypertext Markup Language (HTML) Web page residing on a local disk, but this is only the simplest option. It can be many other things, too—an image file, the result of a script that generates HTML output, a Java applet that’s downloaded and run by the client, and so on.
Apache uses HTTP to talk with clients. It’s a request/response protocol, which means that it defines how clients make requests and how servers respond to them: Every HTTP communication starts with a request and ends with a response. The Apache executable takes its name from the protocol, and on Unix systems is generally called httpd, short for HTTP daemon. I’ll discuss the basics of HTTP later in this chapter; the details are, more or less, the rest of the book.
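To make the request/response cycle concrete, here is a minimal sketch in Python using only the standard library. It is not how Apache itself is implemented; the PAGES table, the DemoHandler class, and the fetch and run_demo helpers are hypothetical names invented for this illustration. A tiny server answers one request with the resource and another with an alternative response that explains the failure.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory "resources"; a real server would usually read files from disk.
PAGES = {"/index.html": b"<html><body>Hello</body></html>"}


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            # No such resource: send an alternative response that says why.
            self.send_error(404, "Not Found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch(port, path):
    """Send one GET request to the local demo server; return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def run_demo():
    """Start the server on a free local port, make two requests, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: the OS picks one
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch(port, "/index.html"), fetch(port, "/missing.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    found, missing = run_demo()
    print("existing page ->", found[0])
    print("missing page  ->", missing[0])
```

Every exchange in the sketch starts with a request from the client and ends with a response from the server, whether or not the resource exists.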
Running Apache: Unix vs. Windows
Apache was originally written to run on Unix servers, and today it’s most commonly found on Linux, BSD derivatives, Solaris, and other Unix platforms. Since Apache was ported to Windows 95 and NT, it has made substantial inroads against the established servers from Microsoft and other commercial vendors—a remarkable achievement given the marketing power of those companies in the traditionally proprietary world of Windows applications.
Because of its Unix origins, Apache 1.3 was never quite as good on Windows as it was on Unix, but with Apache 2, programmers have completely redesigned the core of the Apache server. One major change is the abstraction of platform-specific implementation details into the Apache Portable Runtime (APR). Another is that the server’s core processing logic has been moved into a separate module, known as a Multi Processing Module (MPM). As a result, Apache runs faster and more reliably on Windows, thanks to an MPM dedicated to that platform. NetWare, BeOS, and OS/2 also benefit from MPMs tuned to their platform-specific needs.
Apache runs differently on Unix systems than on Windows. When you start Apache 1.3 on Unix, it creates (or forks) several new child processes to handle Web server requests. Each new process created this way is a complete copy of the original Apache process. Apache 2 provides this behavior in the prefork MPM, which is designed to provide Apache 1.3 compatibility.
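The forking model described above can be sketched in a few lines of Python. This is only an illustration of the fork system call, not Apache's code: the prefork function name and the pipe used to report child process IDs are assumptions made for the example, and it runs only on Unix-like systems because os.fork is not available on Windows.

```python
import os


def prefork(num_children):
    """Fork num_children copies of this process.

    Returns two sorted lists: the child PIDs as seen by the parent, and the
    PIDs the children reported about themselves through a pipe.
    """
    read_fd, write_fd = os.pipe()
    child_pids = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: a copy of the parent process. A real server would loop
            # here, accepting and handling requests.
            os.close(read_fd)
            os.write(write_fd, f"{os.getpid()}\n".encode())
            os._exit(0)  # leave immediately without running parent cleanup
        child_pids.append(pid)

    # Parent: close its write end so reading stops once every child has exited.
    os.close(write_fd)
    reported = []
    with os.fdopen(read_fd) as reader:
        for line in reader:
            reported.append(int(line))
    for pid in child_pids:
        os.waitpid(pid, 0)  # reap each child so none is left as a zombie
    return sorted(child_pids), sorted(reported)


if __name__ == "__main__":
    forked, reported = prefork(3)
    print("parent", os.getpid(), "forked", forked)
```

Each child starts life with its own copy of the parent's state, which is why this model is simple and robust but uses more memory than threads do.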
Windows doesn’t have anything resembling the fork system call, so Apache was extensively rewritten to use threads on that platform. Theoretically, this is a much more efficient and lightweight solution because threads can share resources (thereby reducing their memory requirements). It also allows more intelligent switching between tasks by the operating system. However, Apache 1.3 implemented threads through the Windows POSIX emulation layer (POSIX being a Unix compatibility standard), which meant that it never ran as well as it theoretically could have. Apache 2 uses native Windows threads directly, courtesy of the APR, and accordingly runs much more smoothly.
Thread support in Apache 2 for the Unix platform is found in the worker, leader, threadpool, and perchild MPMs, which provide different processing models depending on your requirements. The new architecture, coupled with the benefits of threaded programming, provides a welcome boost in performance and also reduces the differences between Windows and Unix, thus simplifying future development work on both platforms.
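The resource sharing that makes threads lighter than forked processes can be illustrated with a small Python sketch. It does not reproduce any of the Apache MPMs; the serve_with_threads function, the request paths, and the hit counter are hypothetical, chosen only to show several worker threads updating one data structure held in a single copy in memory.

```python
import queue
import threading


def serve_with_threads(requests, num_workers=4):
    """Handle each request path on one of num_workers threads.

    Returns a dict of hit counts per path. The dict exists once and is shared
    by every thread, unlike forked processes, which each get their own copy.
    """
    pending = queue.Queue()
    for path in requests:
        pending.put(path)

    hits = {}                # shared state, one copy for all workers
    lock = threading.Lock()  # guards the read-modify-write on hits

    def worker():
        while True:
            try:
                path = pending.get_nowait()
            except queue.Empty:
                return  # no work left, so this thread finishes
            with lock:
                hits[path] = hits.get(path, 0) + 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return hits


if __name__ == "__main__":
    print(serve_with_threads(["/a", "/b", "/a"] * 10))
```

The lock is the price of sharing: because all workers see the same memory, updates to it have to be coordinated, which forked processes avoid by not sharing at all.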
NOTE Apache is more stable on Windows NT, 2000, and XP than on Windows 9x and ME because the implementation of threads is cleaner on the former. To run Apache on Windows with any degree of reliability, choose an NT-derived platform because it allows Apache to run as a system service.
However, if reliability and security are a real concern, you should consider only a Unix server, for the sake of both Apache and the server itself. Additionally, new versions of Apache stabilize much faster on Unix than on Windows, so choose Unix to take advantage of improvements to the server as soon as possible.
Apache is capable of running on many operating systems, in most cases straight from an installed binary distribution, notably OS/2, 680x0- and PowerPC-based Macs (both pre–MacOS X and post–MacOS X), BeOS, and NetWare. MacOS X is remarkable in that it’s almost entirely unremarkable; it’s only a Unix variant. Apache 2 provides MPMs for Unix, Windows, OS/2, BeOS, and NetWare as standard, all of which I’ll cover in Chapter 9. Other MPMs are also in development. I won’t cover these additional MPMs in depth, but you can find more information on the ASF Web site at http://www.apache.org/.