HTTP is the underlying protocol that all Web servers and clients use. Whereas HTML defines the way that Web pages are described, HTTP is concerned with how clients request information and how servers respond to them.
HTTP usually works beneath the surface, but a basic understanding of how HTTP works can be useful to the Web server administrator when diagnosing problems and dealing with security issues. This information is also useful because many of Apache's features are HTTP-related as well.
The HTTP/1.1 protocol is defined in detail in RFC 2616, which can be accessed in text form at http://www.w3.org/Protocols/rfc2616/rfc2616.txt. Although this is a technical document, it's both shorter and much more readable than might be expected. Administrators are encouraged to at least glance at it, and those who expect to use Apache's more advanced features will want to keep a printed copy handy. Portable Document Format (PDF) versions are also available.
HTTP is a stateless request/response protocol, which means that the dialogue between a Web client (which may or may not be a browser) and server consists of a request from the client, a response from the server, and any necessary intermediate processing. After the response, the communication stops until another request is received. The server doesn't anticipate further communication after the immediate request is complete, unlike other types of protocols that maintain a waiting state after the end of a request.
HTTP Requests and Responses
The first line of an HTTP request consists of a method that describes what the client wants to do, a Uniform Resource Identifier (URI) indicating the resource to be retrieved or manipulated, and an HTTP version. This is followed by a number of headers that modify the request in various ways, for example, to make it conditional on certain criteria or to specify a hostname (required in HTTP/1.1). On receipt of the request and any accompanying headers, the server determines a course of action and responds to the request. A typical request for an HTML document might be this:
TIP Using the telnet command or a similar command line connection utility, you can connect to a running server and type the request in by hand to see the request and response directly. For example, run telnet localhost 80, type both lines of the request, and then press Enter twice to send it. See Chapter 2 for more about using telnet.
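The request layout described above (a request line, then headers, then a blank line) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: build_request is a hypothetical helper, and the hostname www.alpha-complex.com is borrowed from the examples later in this section.

```python
def build_request(method, uri, headers, version="HTTP/1.1"):
    """Assemble a body-less HTTP request as bytes.

    Hypothetical helper for illustration only.
    """
    # First line: method, URI, and HTTP version separated by single spaces.
    lines = [f"{method} {uri} {version}"]
    # One "Name: value" line per header.
    lines.extend(f"{name}: {value}" for name, value in headers.items())
    # Each line ends in CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Assumed example host; HTTP/1.1 requires the Host header.
request = build_request("GET", "/index.html", {"Host": "www.alpha-complex.com"})
print(request.decode("ascii"))
```

The two text lines this produces are what would be typed into telnet by hand, and the trailing empty line corresponds to the second press of Enter.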
Successful requests return a status code of 200 and the requested information, prefixed by the server's response headers. A typical set of response headers for an Apache server looks something like this:
The status line, which contains the protocol version and status code, appears first, followed by the date and some information about the server. Next are the rest of the response headers, which vary according to the server and request. The most important is the Content-Type header, which tells the client what to do with the response. The Content-Length header lets the client know how long the body of the response is. The Date, ETag, and Last-Modified headers are used in caching.
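To make the structure of a response concrete, the following Python sketch splits the head of a response into its status line and headers. It is a simplified illustration, not a full HTTP parser: parse_response_head is a hypothetical name, the sample text is modelled on the Apache responses shown in Table 1-1, and details such as repeated or folded headers are ignored.

```python
def parse_response_head(raw):
    """Split the head of an HTTP response into status fields and headers."""
    # Everything before the first blank line is the head; the rest is the body.
    head, _, _body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line is "<version> <code> <reason phrase>".
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them for lookup.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


# Sample head modelled on the GET response in Table 1-1.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 28 Jul 2003 17:02:08 GMT\r\n"
    "Server: Apache/2.0.46 (Unix)\r\n"
    "Content-Length: 1776\r\n"
    "Content-Type: text/html; charset=ISO-8859-1\r\n"
    "Connection: close\r\n"
    "\r\n"
)
version, code, reason, headers = parse_response_head(sample)
print(code, reason, headers["content-type"], headers["content-length"])
```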
If an error occurs, an error code and reason are returned on the status line:
It's also possible for the server to return a number of other codes in certain circumstances, for example, redirection.
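Status codes fall into classes identified by their first digit, which is how a client can tell a success from a redirection or an error without knowing every individual code. As a rough sketch, Python's standard http.HTTPStatus enumeration can supply the standard reason phrase for a code; the describe_status helper and its class labels are illustrative names, not part of any server.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the standard reason phrase and the class of a status code."""
    # The first digit of the code selects its class.
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return HTTPStatus(code).phrase, classes[code // 100]


for code in (200, 301, 404, 405, 500):
    print(code, *describe_status(code))
```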
Understanding HTTP Methods
Methods tell the server what kind of request is being made. The examples shown in Table 1-1 are truncated to illustrate the nature of the request and response. A real Apache server will likely send far more headers than these, as illustrated by the sample responses.
In HTTP/1.1, the methods shown in Table 1-2 are also supported.
Table 1-1. Basic HTTP Methods

Method: GET
Function: Get a header and resource from the server. A blank line separates the header and resource.
Request:
    GET /index.html HTTP/1.0
Response:
    HTTP/1.1 200 OK
    Date: Mon, 28 Jul 2003 17:02:08 GMT
    Server: Apache/2.0.46 (Unix)
    Content-Length: 1776
    Content-Type: text/html; charset=ISO-8859-1
    Connection: close

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
    ...

Method: HEAD
Function: Return the header that would be returned by a GET method, but don't return the resource itself. Note that the content length is returned even though there's no content.
Request:
    HEAD /index.html HTTP/1.0
Response:
    HTTP/1.1 200 OK
    Date: Mon, 28 Jul 2003 17:01:13 GMT
    Server: Apache/2.0.46 (Unix)
    Content-Length: 1776
    Content-Type: text/html; charset=ISO-8859-1
    Connection: close

Method: POST
Function: Send information to the server. The server's response can contain confirmation that the information was received. The server must be configured to respond appropriately to a POST, for example, with a CGI script.
Request:
    POST /cgi-bin/search.cgi HTTP/1.0
    Content-Length: 46

    query=alpha+complex&casesens=false&cmd=submit
Response:
    HTTP/1.1 201 CREATED
    Date: Mon, 28 Jul 2003 17:02:20 GMT
    Server: Apache/2.0.46 (Unix)
    Content-Type: text/html; charset=ISO-8859-1
    Connection: close

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML>
    ...
    </HTML>
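The difference between GET and HEAD can be tried out without a real Apache installation. The sketch below starts a throwaway server from Python's standard http.server module on a free local port and queries it with http.client. Everything in it (the DemoHandler class, the PAGE content, the fetch helper) is an assumption made for the demonstration, and the toy server sends far fewer headers than Apache would.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder page body for the demonstration.
PAGE = b"<html><body>Hello</body></html>\n"


class DemoHandler(BaseHTTPRequestHandler):
    """Toy handler standing in for a real server such as Apache."""

    def _send_head(self):
        # Status line and headers are identical for GET and HEAD.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(PAGE)  # GET: headers followed by the resource

    def do_HEAD(self):
        self._send_head()  # HEAD: headers only, no resource

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def fetch(port, method):
    """Send one request and return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, "/index.html")
    response = conn.getresponse()
    body = response.read()
    conn.close()
    return response.status, response.getheader("Content-Length"), body


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = fetch(port, "GET")
head_result = fetch(port, "HEAD")

server.shutdown()
server.server_close()

print(get_result)
print(head_result)
```

Both requests report the same Content-Length, but only the GET result carries a body, which matches the note on HEAD in Table 1-1.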
Table 1-2. Additional HTTP Methods

Method: OPTIONS
Function: Return the list of methods allowed by the server. This is of particular relevance to WebDAV servers, which support additional methods defined in RFC 2518.
Request:
    OPTIONS * HTTP/1.1
    Host: www.alpha-complex.com
Response:
    HTTP/1.1 200 OK
    Date: Mon, 28 Jul 2003 16:54:55 GMT
    Server: Apache/2.0.46 (Unix)
    Allow: GET, HEAD, POST, OPTIONS, TRACE
    Content-Length: 0
    Content-Type: text/plain; charset=ISO-8859-1

Method: TRACE
Function: Trace a request to see what the server actually sees. This displays what the request looks like after it has passed through any intermediate proxies. It may also be directed at an intermediate proxy by the Max-Forwards header to discover information about intermediate servers. For more information on TRACE, see RFC 2616.
Request:
    TRACE * HTTP/1.1
    Host: www.alpha-complex.com
Response:
    HTTP/1.1 200 OK
    Date: Mon, 28 Jul 2003 17:09:18 GMT
    Server: Apache/2.0.46 (Unix)
    Content-Type: message/http; charset=ISO-8859-1

    TRACE * HTTP/1.1
    Host: www.alpha-complex.com

Method: DELETE
Function: Delete a resource on the server. In general, the server should not allow DELETE methods, so attempting to use one should produce a response like the one in the example. The exception is WebDAV servers, which do implement DELETE.
Request:
    DELETE /document.html HTTP/1.1
    Host: www.alpha-complex.com
Response:
    HTTP/1.1 405 Method Not Allowed
    Date: Mon, 28 Jul 2003 17:24:37 GMT
    Server: Apache/2.0.46 (Unix) DAV/2
    Allow: GET, HEAD, OPTIONS, TRACE
    Content-Type: text/html; charset=ISO-8859-1

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>405 Method Not Allowed</TITLE>
    </HEAD><BODY>
    <H1>Method Not Allowed</H1>
    <P>The requested method DELETE is not allowed for the URL /document.html.</P>
    </BODY></HTML>

Method: PUT
Function: Create or change a file on the server. In general, the server should not allow PUT methods because POST is generally used instead. PUT implies a direct relationship between the URI in the PUT request and the same URI in a subsequent GET, but this is not implied by POST. Again, WebDAV servers may implement PUT.
Request:
    PUT /newfile.txt HTTP/1.1
    Host: www.alpha-complex.com
    Content-Type: text/plain
    Content-Length: 63

    This is the contents of a file we want to create on the server
Response:
    HTTP/1.1 201 CREATED
    Date: Mon, 28 Jul 2003 17:30:12 GMT
    Server: Apache/2.0.46 (Unix) DAV/2
    Content-Type: text/html; charset=ISO-8859-1

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
    ...
    </HTML>

Method: CONNECT
Function: Enable proxies to switch to a tunneling mode for protocols like SSL. See the AllowCONNECT directive in Chapter 8 for more details.
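A client can use the Allow header returned by OPTIONS (or by a 405 response) to decide which methods are worth attempting. The following sketch parses the two Allow values that appear in Table 1-2; allowed_methods is a hypothetical helper name.

```python
def allowed_methods(allow_header):
    """Turn an Allow header value into a set of method names."""
    # The value is a comma-separated list; surrounding whitespace is ignored.
    return {m.strip() for m in allow_header.split(",") if m.strip()}


# Allow values copied from the OPTIONS and DELETE responses in Table 1-2.
options_allow = allowed_methods("GET, HEAD, POST, OPTIONS, TRACE")
delete_allow = allowed_methods("GET, HEAD, OPTIONS, TRACE")

print("DELETE" in options_allow)             # is DELETE advertised at all?
print(sorted(options_allow - delete_allow))  # methods only the first list offers
```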
A URI is a textual string that identifies a resource, either by name, by location, or by any other format that can be understood by the server. URIs are defined in RFC 2396.
The URI is usually a conventional Uniform Resource Locator (URL) as understood by a browser, of which the simplest possible form is the forward slash (/). Any valid URI on the server can be specified here, for example:
If the method doesn't require a specific resource to be accessed, the asterisk (*) URI can be used. The OPTIONS example in Table 1-2 just shown uses the asterisk. Note that for these cases, it's not incorrect to use a valid URI, just redundant.
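As a small illustration of how the URI part of a request line might be examined, the sketch below separates the asterisk case from an ordinary URI and splits the latter into a path and query parameters using Python's urllib.parse. The describe_request_line helper and the query-string URI in the usage lines are assumptions for the example, not requests taken from the text.

```python
from urllib.parse import parse_qs, urlsplit


def describe_request_line(line):
    """Return (method, path, query parameters) for a request line.

    The path is None when the asterisk URI is used, meaning the request
    addresses the server as a whole rather than a specific resource.
    """
    method, uri, _version = line.split(" ")
    if uri == "*":
        return method, None, {}
    parts = urlsplit(uri)
    return method, parts.path, parse_qs(parts.query)


print(describe_request_line("OPTIONS * HTTP/1.1"))
print(describe_request_line("GET / HTTP/1.0"))
# Hypothetical URI carrying a query string.
print(describe_request_line("GET /cgi-bin/search.cgi?query=alpha+complex HTTP/1.1"))
```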