This article introduces those new to networking to Apache, the Hypertext Transfer Protocol (HTTP), and the basics of system administration. It is excerpted from chapter one of Peter Wainwright's book Pro Apache (Apress, 2004; ISBN: 1590593006).
IP addresses are made up of two parts: the network address on the left and the local host address to the right. The network classes A, B, and C correspond to networks with an exact number of octets, but you can use a netmask (sometimes called a subnet mask) to divide the network and local address at points of your choosing using binary arithmetic. This tells you whether two hosts are local to each other or on different networks. The netmask is a fundamental attribute of the network interface, just like the IP address. A server can use it to determine whether a destination IP address is local and can be contacted directly or must be reached indirectly through an intermediate router.
The netmask is a number that looks like an IP address but isn’t. It defines which binary bits of the address form the network part. It’s all 1s to the left of the dividing line between network and local host and all 0s to the right. A netmask with a 1 to the right of a 0 is invalid. To get the network address for an IP address, the netmask is combined with it using a bitwise AND; this gives you 220.127.116.11 for the network address in the class B example.
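The AND operation and the validity rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the book's text: the function names and the address 172.16.5.9 are assumptions chosen for the example. A valid mask is treated here as an unbroken run of 1 bits followed only by 0 bits.

```python
import ipaddress

def is_valid_netmask(mask: str) -> bool:
    """Return True if the mask is a run of 1 bits followed only by 0 bits."""
    value = int(ipaddress.IPv4Address(mask))
    inverted = value ^ 0xFFFFFFFF  # host bits become 1s, network bits become 0s
    # For a valid mask the inverse looks like 0...01...1, so adding 1
    # carries into a single higher bit and the AND of the two is zero.
    return (inverted & (inverted + 1)) == 0

def network_address(ip: str, mask: str) -> str:
    """AND the address with the netmask to obtain the network address."""
    ip_value = int(ipaddress.IPv4Address(ip))
    mask_value = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_value & mask_value))

# Hypothetical class B style address, used only for illustration
print(network_address("172.16.5.9", "255.255.0.0"))  # 172.16.0.0
print(is_valid_netmask("255.255.0.0"))               # True
print(is_valid_netmask("255.0.255.0"))               # False
```

The host part of the address is simply whatever the AND clears, which is why a mask with gaps in its run of 1s can't describe a single dividing line.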
The netmask accompanies an IP address and determines which network block (A, B, or C) the address belongs to, as well as the size of the address space left for hosts. A netmask for a class C address space, for example, normally takes the form 255.255.255.0. The octets with a value of 255 indicate the class: one octet of 255 followed by zeros indicates a class A, two octets of 255 indicate a class B, and three octets of 255, as in the example, indicate a class C.
Simply put, the netmask does exactly what it sounds like it does: It masks the net or network. In the example just shown, the class C host is determined solely by its last octet value; therefore, the first three octets are network-related. Using this knowledge lets you create the netmask of 255.255.255.0, or N.N.N.H, where N is network and H is host.
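As a small sketch of the N.N.N.H notation, the following Python turns a whole-octet netmask into its pattern of network and host octets and looks up the matching class. The function names are assumptions made for this example, and the code deliberately handles only masks that split on octet boundaries.

```python
def mask_pattern(netmask: str) -> str:
    """Render a whole-octet netmask as N (network) and H (host) letters."""
    letters = []
    for octet in netmask.split("."):
        if octet == "255":
            letters.append("N")
        elif octet == "0":
            letters.append("H")
        else:
            raise ValueError(f"{netmask} does not split on an octet boundary")
    pattern = ".".join(letters)
    if "H.N" in pattern:
        # A network octet after a host octet means the mask is not contiguous
        raise ValueError(f"{netmask} has network bits after host bits")
    return pattern

def address_class(netmask: str) -> str:
    """Map the three default netmasks to their class letter."""
    classes = {"N.H.H.H": "A", "N.N.H.H": "B", "N.N.N.H": "C"}
    return classes[mask_pattern(netmask)]

print(mask_pattern("255.255.255.0"))  # N.N.N.H
print(address_class("255.255.0.0"))   # B
```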
If two IP addresses map to the same network address after being joined by an AND with the netmask, they’re on the same network; if not, they’re on different networks. IPv6 netmasks are no different from their IPv4 counterparts, just longer and less interesting to look at.
For example, note the three hosts with IP addresses shown in Table 1-5.
Table 1-5. Example Hosts on Different Networks
If you define a netmask of 255.255.255.0 for the network interfaces on each host, Host A and Host B will be assumed to be on the same network. If Host A sends a packet to Host B, TCP/IP will attempt to deliver it directly. However, Host B can’t send a packet to Host C directly because the netmask stipulates that 192.168.1 and 192.168.2 are different networks. Instead it’ll send the packet to a gateway. Each host is configured with the IP address of at least one gateway, to which it sends packets it can’t deliver itself.
If, however, you define a netmask of 255.255.0.0, all three hosts will be considered to be on the same network. In this case, Host A will be able to send to Host C directly, assuming they’re connected to the same physical network. When you experience routing problems on a network, a badly configured netmask is often the cause, particularly if Host A can connect to Host B, but not vice versa.
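The effect of the two netmasks on the three hosts can be sketched as follows. Only the network prefixes 192.168.1 and 192.168.2 come from the text; the full host addresses and the gateway address below are hypothetical values chosen for illustration, as are the function names.

```python
import ipaddress

def on_same_network(a: str, b: str, netmask: str) -> bool:
    """Two hosts are local to each other if ANDing each with the mask matches."""
    mask = int(ipaddress.IPv4Address(netmask))
    return (int(ipaddress.IPv4Address(a)) & mask) == (int(ipaddress.IPv4Address(b)) & mask)

def delivery(src: str, dst: str, netmask: str, gateway: str) -> str:
    """Decide whether src would send to dst directly or hand off to its gateway."""
    if on_same_network(src, dst, netmask):
        return "direct"
    return f"via gateway {gateway}"

# Hypothetical addresses on the 192.168.1 and 192.168.2 networks
HOST_A = "192.168.1.1"
HOST_B = "192.168.1.2"
HOST_C = "192.168.2.1"
GATEWAY = "192.168.1.254"

print(delivery(HOST_A, HOST_B, "255.255.255.0", GATEWAY))  # direct
print(delivery(HOST_B, HOST_C, "255.255.255.0", GATEWAY))  # via gateway 192.168.1.254
print(delivery(HOST_A, HOST_C, "255.255.0.0", GATEWAY))    # direct
```

Each host makes this decision using its own netmask, which is how two hosts with different masks can disagree about whether they are neighbors.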
IP is responsible for ensuring that a packet addressed to a particular host gets delivered to that host. By dividing the address space into logical networks, the task of finding a host becomes much simpler—instead of having to know every host on the Internet, a host needs to know only a list of gateways and pick the one that’s the next logical step on the route. The identity of the next stop is then fed to the underlying protocol (for example, Ethernet) so that the packet is forwarded to it for onward delivery. In Ethernet’s case, the identity is the Ethernet address of the gateway on the local network. The gateway carries out the same procedure using its own list of gateways and so on until the packet reaches the final gateway and its destination.
NOTE For a practical example of netmasks in action, see the sample ifconfig output given later in the chapter. This shows two local addresses and an external Ethernet interface partitioned by a netmask to force them onto separate networks.
Web Services: Well-Known Ports
When a client contacts a server, it’s generally because the client wants to use a particular service—e-mail or File Transfer Protocol (FTP), for example. To differentiate between services, TCP implements the concept of ports, allowing a single network interface to provide many different services. When a client makes a network connection request to a server, it specifies not only the IP address of the server it wants to contact as required by IP, but also a port number.
By default, HTTP servers such as Apache serve port 80, which is the standard port number for HTTP. When a connection request arrives for port 80, the operating system knows that Apache is watching that port and directs the communication to it. Each standard network service and protocol has an associated port that clients may connect to for that service, be it HTTP, FTP, telnet, or another service.
The standard list of port numbers is defined under Unix in a file called /etc/services, which lists all the allocated port numbers. The corresponding file under Windows is called Services and is located under the Windows installation directory, in C:\WINNT\system32\drivers\etc\. In fact, the operating system and the various daemons responsible for providing services already know what ports they use. Other applications use /etc/services to refer to a service by name instead of by number. /etc/services also specifies which protocol (TCP or UDP) a service uses; many services handle both TCP and UDP connections. The following is a short list of some of the most common port numbers, extracted from a typical /etc/services file:
ftp      21/tcp            # File Transfer Protocol
finger   79/tcp            # Finger Daemon
www      80/tcp    http    # WorldWideWeb HTTP
www      80/udp            # HyperText Transfer Protocol
pop2     109/tcp           # Post Office Protocol Version 2
pop3     110/tcp           # Post Office Protocol Version 3
nntp     119/tcp           # USENET News Transfer Protocol
ntp      123/udp           # Network Time Protocol
imap2    143/tcp           # Interactive Mail Access Protocol V2
snmp     161/udp           # Simple Net Management Protocol
imap3    220/tcp           # Interactive Mail Access Protocol V3
https    443/tcp           # Secure HTTP
https    443/udp           # Secure HTTP
uucp     540/tcp           # Unix to Unix Copy
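Entries in /etc/services follow the pattern of a service name, a port/protocol pair, optional aliases, and an optional # comment. The sketch below parses text in that format into a lookup table. The sample entries and the parse_services name are assumptions for the example; it reads from a string rather than the real file so that it behaves the same on any system.

```python
SAMPLE = """\
ftp      21/tcp            # File Transfer Protocol
www      80/tcp    http    # WorldWideWeb HTTP
www      80/udp            # HyperText Transfer Protocol
https    443/tcp           # Secure HTTP
https    443/udp           # Secure HTTP
"""

def parse_services(text: str) -> dict:
    """Map (service name or alias, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        # Drop the comment, then split the remaining columns on whitespace
        entry = line.split("#", 1)[0].split()
        if len(entry) < 2:
            continue  # blank or comment-only line
        name, port_proto, *aliases = entry
        port, proto = port_proto.split("/")
        for key in (name, *aliases):
            table[(key, proto)] = int(port)
    return table

services = parse_services(SAMPLE)
print(services[("http", "tcp")])   # 80
print(services[("https", "udp")])  # 443
```

On a system that has a services database, Python's standard socket.getservbyname("http", "tcp") performs the same kind of name-to-port lookup.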
Of particular interest in this list are the HTTP port at 80 and the HTTPS port at 443. Note that both UDP and TCP connections are acceptable on these ports; how they’re handled depends on the program listening on the port. Just because a service is listed doesn’t mean that the server will respond to it. Indeed, there are plenty of good reasons not to respond to some services: telnet, FTP, SNMP, POP-3, and finger are all entirely unrelated to serving Web pages and can be used to weaken server security.
On Unix systems, port numbers below 1024 are reserved for system services and aren’t useable by programs run by nonprivileged users. For Apache to run on port 80, the standard HTTP port, it has to be started by root or at system startup by the operating system. Nonprivileged users can still run an Apache server as long as they configure Apache to use a port number of 1024 or higher. On Windows, no such security conditions exist.
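A rough way to see this restriction is to try binding a socket to a port and report what the operating system says. This is an illustrative sketch only: the function names are assumptions, and the result for port 80 depends on the platform, on who runs the script, and on whether something is already listening there.

```python
import socket

PRIVILEGED_LIMIT = 1024  # ports below this are reserved for system services on Unix

def needs_privileges(port: int) -> bool:
    """True for the reserved range described above."""
    return 0 < port < PRIVILEGED_LIMIT

def try_bind(port: int, host: str = "127.0.0.1") -> str:
    """Attempt to bind a TCP socket and describe the outcome."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            return "permission denied"
        except OSError as exc:
            return f"failed: {exc.strerror}"
        return f"bound to port {sock.getsockname()[1]}"

print(needs_privileges(80), try_bind(80))      # outcome varies by system and user
print(needs_privileges(8080), try_bind(8080))  # usually succeeds if the port is free
```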
Internet Daemon: The Networking Super Server
Not every service supplied by a host is handled by a constantly running daemon. Because that would be very wasteful of system resources, Unix runs many of its services through the Internet daemon (inetd), a super server that listens to many different ports and starts a program to deal with connections as it receives them.
One such service is FTP, which usually runs on port 21. Unlike Apache, which usually runs stand-alone and appears as several httpd processes, there’s no ftpd process running under normal conditions. However, inetd is looking at port 21, and when it receives a TCP connection request, it starts a copy of ftpd to handle the connection. Once started, ftpd negotiates its own private connection with the client, allowing inetd to get back to listening. Once the communication is over—in FTP’s case, when the file is transferred or aborted—the daemon exits.
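The super-server idea can be sketched in Python: one process listens on several ports and starts a handler only when a connection arrives. This is a simplified illustration, not how inetd is implemented. The real inetd launches a separate program such as ftpd, whereas this sketch runs a Python function in a thread, and the class, handler names, and use of OS-assigned ports are all assumptions made for the example.

```python
import selectors
import socket
import threading

def echo_handler(conn: socket.socket) -> None:
    """Stand-in for a service program: answer one request, then exit."""
    with conn:
        conn.sendall(b"echo: " + conn.recv(1024))

def greet_handler(conn: socket.socket) -> None:
    """A second stand-in service that just sends a greeting."""
    with conn:
        conn.sendall(b"hello\n")

class SuperServer:
    """Listen on one port per service and start a handler per connection."""

    def __init__(self, handlers: dict, host: str = "127.0.0.1"):
        self.selector = selectors.DefaultSelector()
        self.ports = {}
        for name, handler in handlers.items():
            listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            listener.bind((host, 0))  # port 0: let the OS pick a free port
            listener.listen()
            self.selector.register(listener, selectors.EVENT_READ, handler)
            self.ports[name] = listener.getsockname()[1]

    def serve_once(self, timeout: float = None) -> list:
        """Wait for connection requests and start one handler for each."""
        started = []
        for key, _events in self.selector.select(timeout):
            conn, _addr = key.fileobj.accept()
            worker = threading.Thread(target=key.data, args=(conn,))
            worker.start()  # the listener is free again immediately
            started.append(worker)
        return started

    def close(self) -> None:
        for key in list(self.selector.get_map().values()):
            self.selector.unregister(key.fileobj)
            key.fileobj.close()
        self.selector.close()

def demo() -> bytes:
    server = SuperServer({"echo": echo_handler, "greet": greet_handler})
    with socket.create_connection(("127.0.0.1", server.ports["echo"])) as client:
        client.sendall(b"hi")
        for worker in server.serve_once(timeout=5):
            worker.join()  # the handler exits once the exchange is over
        reply = client.recv(1024)
    server.close()
    return reply

if __name__ == "__main__":
    print(demo())
```

No handler exists until a client connects, and each one ends when its exchange is finished, which mirrors the lifecycle described for ftpd.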
Apache 1.3 has a configuration directive, ServerType, which allows it to run either as a stand-alone service or to be invoked by inetd, as FTP is. In this configuration, there are no httpd processes running until inetd receives a connection request for port 80 (or on whatever port inetd has been configured to start Apache). inetd then runs httpd and gives it the incoming connection, allowing Apache to handle the request. Because a separate invocation of Apache is started for each individual client connection, and each invocation lasts only for as long as it takes to satisfy the request, this is a hideously inefficient way to run Apache, which is why almost all Apache configurations are stand-alone. Consequently, Apache 2 removes the option entirely.
inetd isn’t without its problems. As the central coordinating daemon for many lesser networking services, it’s one of the biggest sources of network security breaches. The daemon itself isn’t insecure, but it implements services such as telnet that are. As a result, many Web server administrators choose to disable it entirely because none of the services it manages are necessary for a Web server. More recent Unix distributions come with an improved daemon called xinetd that builds in additional security measures, but in most cases there are still no compelling reasons to enable it. See Chapter 10 for more information on this topic.