Getting Started with Apache 2.0 Part II
By: Harish Kamath
    2005-03-21


    Table of Contents:
  • Getting Started with Apache 2.0 Part II
  • The Apache Log Files
  • Who Are You?
  • One Server, One Hundred Websites



    Getting Started with Apache 2.0 Part II - Who Are You?
    ( Page 3 of 4 )

    In the previous section, I showed you how to configure the error log file generated by the Apache Web server. While this helps a developer during the development and maintenance phases of a project, it is of limited use to a Web master, who wants to analyze the traffic to the website and the visitors behind it, not its errors. Fortunately, for those requirements, there is the Apache "access" log file.

    Now it's time to review the directives that govern these "access" log file(s):

    CustomLog logs/access_log common

    The syntax of the "CustomLog" directive is similar to that of the "ErrorLog" directive: you have to specify the name and the path (absolute or relative) of the log file. The only difference is the presence of the "common" keyword at the end of the line - this represents the "nickname" for the log entry format that you would like to use.

    Yes, you can define your own custom format (as well as "nicknames") using the "LogFormat" directive. There are four pre-defined formats listed in the default configuration file:

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    LogFormat "%{Referer}i -> %U" referer
    LogFormat "%{User-agent}i" agent

    In general, the syntax of the "LogFormat" directive looks something like this:

    LogFormat LOG_FILE_ENTRY_FORMAT  NICKNAME

    While the syntax describing the format appears complex at first glance, it will make sense once you understand what each symbol in the format string stands for. Consider the following "LogFormat" entry:

    LogFormat "%h %l %u %t \"%r\" %>s %b" common

    The keyword "common" represents its nickname and is used in the "CustomLog" directive to refer to this format; we've already seen that above.
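
    As a side note, the "CustomLog" directive also accepts the format string itself in place of a nickname. The following is a minimal sketch of that form; the file name "logs/inline_access_log" is only an illustrative assumption, not something taken from the default configuration file:

    ```apache
    # Same fields as the "common" nickname, written inline.
    # "logs/inline_access_log" is a hypothetical file name.
    CustomLog logs/inline_access_log "%h %l %u %t \"%r\" %>s %b"
    ```

    A nickname is usually the tidier choice when the same format is shared by several log files.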

    Now, let me concentrate on the format string itself, where each symbol has a very specific purpose. In order to make things easier, let me paste a sample entry from my local "access_log" file:

    127.0.0.1 - root [18/Dec/2003:12:52:43 +0530] "GET /phpmyadmin/db_details_structure.php?lang=en-iso-8859-1&server=1&db=industry HTTP/1.1" 200 28406
     

    Next, let me map each symbol from the format string to the actual entry in the above log file snippet:

    • The "%h" (value in log file: 127.0.0.1 ) represents the remote host. This is the IP address of the client machine, unless host name lookups are enabled, in which case the host name is logged instead.

    • The "%l" (value in log file: - ) is replaced by the RFC 1413 identity of the client. However, the official documentation states that this value is "highly unreliable and should almost never be used except on tightly controlled internal networks." Note that a "hyphen," i.e. the "-" symbol is used by the Web server to indicate that it could not retrieve a value for a particular parameter.

    • The "%u" (value in log file: root) indicates the username of the user accessing the Web server using HTTP authentication. Often, this value is not recorded as most visitors are anonymous to the Web server.

    • The "%t" (value in log file: 18/Dec/2003:12:52:43 +0530) represents the date and time of the request. Note that you can customize the format of the time stamp by writing "%{format}t", where "format" uses the syntax of the strftime() C function.

    • The "%r" (value in log file: GET /phpmyadmin/db_details_structure.php?lang=en-iso-8859-1&server=1&db=industry HTTP/1.1) is replaced by the first line of the request sent by the client machine: the request method (GET or POST, for example), the requested URL and the protocol. For a GET request, the URL also carries all the parameters sent in the query string, as seen above.

    • The "%>s" (value in log file: 200) represents the HTTP status code and is very useful to programmers and Web masters. Some of the common values listed in this column are 200 (indicating a successful response), 404 (indicating that the requested file was not found) and 500 (indicating an error occurred during the execution of the requested script).

      You can view a list of all HTTP status codes at the following URL: http://www.w3.org/Protocols/rfc2616/rfc2616.txt 

    • The "%b" (value in log file: 28406) represents the size of the response in bytes, excluding the HTTP headers; this gives an indication of the bandwidth used by the website. When no bytes are sent, a "-" is logged instead of a zero.

    Note that if you wish to insert quotes in the log files, you have to escape them in the log format string. For example, the \"%r\" syntax encloses the request URL within quotes in the log file.
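
    To illustrate the customizable time stamp mentioned for "%t" above, here is a minimal sketch that swaps the default time stamp for a year-first one. The nickname "isotime" and the file name "logs/access_iso_log" are assumptions made up for this example:

    ```apache
    # "isotime" is a hypothetical nickname: the "common" format with a
    # strftime()-style time stamp such as 2003-12-18 12:52:43.
    LogFormat "%h %l %u %{%Y-%m-%d %H:%M:%S}t \"%r\" %>s %b" isotime

    # Write entries in that format to a separate (hypothetical) log file.
    CustomLog logs/access_iso_log isotime
    ```

    Keep in mind that many log analysis tools expect the default time stamp, so a custom one is probably best kept in an additional log file rather than in the main one.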

    And that’s not all - there are many more symbols that you can use in your format string. Here are some important ones:

    • The "%A" symbol will display the local IP address.

    • The "%B" will display the number of bytes sent to the client, excluding the size of the HTTP headers. It is the same value as "%b", except that a zero is logged when no bytes are sent. This is useful if you want to get an accurate picture of the bandwidth used by the different elements of your website such as images, style sheets, and so forth.

    • The "%{VARNAME}e" will list the contents of the "VARNAME" environment variable.

    • The "%f" symbol will display the filename requested by the client.

    • The "%H" will indicate the request protocol.

    • The "%m" symbol will represent the request method.

    • The "%T" will be replaced by the time, in seconds, taken by the server to serve the request.
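
    The symbols listed above can be combined into a format of your own. The following is only a sketch: the nickname "timing" and the file name "logs/timing_log" are hypothetical, and the choice of fields is just one possibility:

    ```apache
    # "timing" is a hypothetical nickname. Fields, in order:
    # remote host, request method, URL path, request protocol,
    # final status, bytes sent (without headers), seconds taken.
    LogFormat "%h %m %U %H %>s %B %T" timing

    # Log every request in that format to a hypothetical file.
    CustomLog logs/timing_log timing
    ```

    A log like this makes it easier to spot slow or unusually large responses without wading through the full request line.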

    Finally, there are two more important symbols that I would like to highlight before I move to the next directive:

    • The "%{User-agent}i" symbol is used to store the contents of the "User-Agent" request header, which identifies the client software accessing the website. This can be used to identify the different browsers that you should support on the basis of the visitors accessing the website.

    • The "%{Referer}i" symbol is used to store the contents of the "Referer" request header, that is, the resource that referred the visitor to the current page. Once again, this is an ideal mechanism to study how visitors arrive at your website.
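
    Both symbols are already part of the pre-defined formats listed earlier, so recording them only takes a "CustomLog" line. A minimal sketch follows; the file names "logs/referer_log" and "logs/agent_log" are assumptions you can change freely:

    ```apache
    # Option 1: a single log file in the pre-defined "combined" format,
    # which appends the Referer and User-Agent headers to each entry.
    CustomLog logs/access_log combined

    # Option 2: keep the headers in separate log files, using the
    # pre-defined "referer" and "agent" nicknames.
    #CustomLog logs/referer_log referer
    #CustomLog logs/agent_log agent
    ```

    If you switch to "combined", remove or comment out the existing "common" line for the same file so that each request is not logged twice.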

    One more directive deserves a mention: the "HostnameLookups" directive tells the Web server to attempt to map the IP address of the client to its host name.

    HostnameLookups Off

    By default, this directive is turned "Off." However, if you turn this directive "On," the log file will contain human-readable host names (such as "www.kcsonline.biz") instead of the machine-friendly IP addresses (such as "69.44.155.211"). There is one caveat that you should keep in mind: the Web server has to perform an additional DNS lookup for every request in order to obtain the host name, which delays each request and can severely affect performance.

    Before I conclude this section, a short note on analyzing the Apache log files: thanks to the popularity of this Open Source Web server, a multitude of products can help you analyze the log files generated by Apache. At one end of the spectrum, you have HTTP-Analyze (http://www.http-analyze.org/), available for free to personal users, and at the other end you have sophisticated (read: expensive) tools such as Web Trends (http://www.webtrends.com/). The choice is yours!



     
     





    © 2003-2009 by Developer Shed. All rights reserved.