Logging in Apache

Administrators need to keep regular tabs on their Web servers to make sure they are running smoothly, so that their clients don’t meet with any unpleasant surprises. Logging helps you to spot performance problems before they become an issue, and also assists in the detection of possible security concerns. This article will discuss configuring Apache for logging purposes, and will go into some detail about remote logging solutions. It is excerpted from Hardening Apache by Tony Mobily (Apress, 2004; ISBN: 1590593782).

KEEPING AN EYE ON WEB SERVERS so that they run smoothly is an essential part of the administrator’s job, for both performance and reliability purposes. Sensible logging helps to detect performance problems well before they become apparent to users, and provides evidence of potential security problems.

In this chapter I will first explain how to configure Apache for logging purposes, highlighting the most common problems. I will then introduce remote logging using syslogd, the standard logging server that comes with Unix. Finally, I will propose a remote logging solution, which will allow you to encrypt logging information and store it on a remote database using MySQL.

Why Logging?

Log files show you what your daemons are doing. From a security perspective, Apache’s log files are used for:

  • Logging requests made and pages served in order to identify “suspicious” requests.

  • Logging Apache’s extra information, such as errors and warnings. This information is very interesting, because an attack generally creates some abnormal entries.

The importance of log files is often underestimated. Sometimes, even in important production servers, they are left there to grow and grow, until one day they make themselves noticed because they have filled up the file system.

NOTE Log files should never be placed on file systems that don’t support adequate locking. Typically that means NFS, but it might also mean Samba, AFS, and others. You must log either to a remote application or to a local file system.

Configuring Logging in Apache

I will give an overview of how to configure log files in Apache. Remember that this is not a comprehensive explanation, and for more information you should look at Apache’s official documentation: http://httpd.apache.org/docs/logs.html.

Normal (Classic) Configuration

There are two types of log information in Apache: the access log (handled by the module mod_log_config) and the error log.

The access log records every request sent to the web server. A typical configuration is:

LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

NOTE A better term for access log would be activity log, because you can use the powerful Apache directives to potentially create log files that just log unique ids or user-agents. However, in this book I will use the more common term access log.

Here, LogFormat sets a log format and assigns a name to it, common in this case. Then, Apache is instructed to log access information in the file logs/access_log, using the format defined in the previous line (common). To find out the exact meaning of each parameter, check Apache’s documentation. You will find out that Apache can log almost anything pertaining to a request, including the client’s address and the type of request itself. The log file format just described is the most common for HTTP requests (for example, IIS is capable of generating the same result), hence its name.
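As a rough illustration of what the common format makes possible, the following shell sketch counts requests per status code. The function name status_summary is an assumption made for this example; it relies on the status code being the first field after the quoted request line, as in the LogFormat shown above.

```shell
#!/bin/sh
# Sketch: count requests per HTTP status code in a "common" format log.
# status_summary is a hypothetical helper name, not part of Apache.
status_summary() {
    # $1: path to the access log
    # Splitting each line on the double quote gives:
    #   field 1 = host, ident, user and time
    #   field 2 = the request line
    #   field 3 = " status bytes"
    awk -F'"' '
        { split($3, rest, " "); count[rest[1]]++ }
        END { for (s in count) print s, count[s] }
    ' "$1" | sort
}
```

Calling status_summary logs/access_log prints one line per status code, followed by the number of requests that produced it.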

Apache server’s error messages are logged separately, using a different file. In this case, there is no definite format for the messages, and these directives are defined:

ErrorLog logs/error_log
LogLevel warn

The first directive, ErrorLog, instructs Apache to log all the errors in logs/error_log. The second directive sets the minimum importance for a message to be logged (the “level” of the message). These levels are defined in Table 3-1.

LOGLEVEL   SIGNIFICANCE OF ERROR
emerg      System is unstable
alert      Immediate action required
crit       Critical error
error      Non-critical error
warn       Warning
notice     Normal but significant
info       Informational
debug      Debug level

Table 3-1. Apache Error Levels

Remember that if you decide to set the log level to crit, the messages for more important levels will be logged as well (in this case, alert and emerg).

NOTE Notice level messages are always logged, regardless of the LogLevel setting.

Delegating Logging to an External Program

Sometimes, it is advisable to delegate all the logging to specifically developed parsing engines or archiving utilities. When Apache is started, it runs the logging program and sends all the logging messages to it. This solution is valid in many situations. For example:

  • When you don’t want to stop and restart your Apache server to compress your logs.

  • When you have many virtual hosts. If you use a different log file for each virtual host, Apache will need to open two file descriptors for every virtual domain, wasting some of the kernel’s and processes’ resources.

  • When you want to centralize your logging into one single host. The program specified in the configuration could send the log lines elsewhere instead of storing them locally, or for increased reliability, it could do both.
  • When you want to create a special log filter that watches every log request looking for possible security problems.

There are some disadvantages to using an external program. For example, if the program is too complex, it might consume too much CPU time and memory. In addition, if the external program has a small memory leak, it might eventually chew up all the system’s memory. Finally, if the logging program blocks, there is a chance of causing a denial of service on the server.

To delegate logging to an external program (piped logging), you can use the following syntax:

CustomLog "|/usr/local/apache2/bin/rotatelogs /var/log/access_log 86400" common

The command /usr/local/apache2/bin/rotatelogs /var/log/access_log 86400 is run by Apache at startup time.

In this case, the program rotatelogs will be fed the log lines by Apache, and will write them on /var/log/access_log. Remember that you can use the same syntax for Apache’s error log using the directive ErrorLog. For more information about how CustomLog and ErrorLog work, refer to Apache’s official documentation.
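A custom filter of the kind described in the list above can be very small. The sketch below is an example built on assumptions rather than a production logger: the function name filter_log, the file names passed to it, and the two patterns it looks for are all made up for illustration. It copies every line it receives to one file and also copies suspicious-looking lines to a second file.

```shell
#!/bin/sh
# Hypothetical piped-log filter: Apache would write each log line to stdin.
# $1: file that receives every line
# $2: file that also receives lines matching a suspicious pattern
filter_log() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$1"
        case "$line" in
            # Example patterns only; a real filter would need its own list
            *cmd.exe*|*/etc/passwd*)
                printf '%s\n' "$line" >> "$2"
                ;;
        esac
    done
}
```

A script that defines this function and ends with a call such as filter_log /var/log/access_log /var/log/suspicious_log could be named after the pipe character in a CustomLog directive. Keeping the loop this short is deliberate, given the risks of complex logging programs described earlier.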

Security Issues of Log Files

Logging appears to be a simple process, and you might wonder why security is involved at all. There are some very basic security problems connected to logging. For example:

  • Logs are written as root, and permission problems can be dangerous.

  • Logs are written in plain text, and can be easily modified and forged.

  • Logging programs are executed as root; if they are vulnerable, an attacker may gain root access.

  • Logs can cause a DOS if they run out of disk space (an attacker might do this deliberately).

  • Logging can be unreliable; if Apache dies (for example after an attack), the logs could be incomplete.

I will discuss each of these problems in the following sections.

Logs and Root Permissions

Apache is normally started by the root user, in order to be able to listen to port 80 (non-root processes can only listen to ports 1024 and higher). After starting up, Apache opens the log files, and only then drops its privileges. This allows the Apache server to write to files that no other user may access (if the permissions are set properly), protecting the log files. If the log files were opened after dropping privileges, they would be a lot more vulnerable.

This implies that if the directory where the logs are stored is writable by common users, then an attacker can do this (note the wrong permissions for the logs directory):

[merc@localhost merc]$ cd /usr/local/apache2/
[merc@localhost apache2]$ ls -l
total 52
drwxr-xr-x 2 root root 4096 Oct 4  14:50 bin
drwxr-xr-x 2 root root 4096 Sep 13 23:18 build
drwxr-xrwx 2 root root 4096 Oct 5  18:10 logs
[...]
drwxr-xr-x 2 root root 4096 Oct 4  18:50 modules
[merc@localhost apache2]$ cd logs
[merc@localhost logs]$ ls -l
total 212
-rw-r--r-- 1 root root 124235 Oct 5 18:11 access_log
-rw-r--r-- 1 root root 74883  Oct 5 18:10 error_log
-rw-r--r-- 1 root root 5      Oct 5 18:10 httpd.pid
[merc@localhost logs]$ rm access_log
rm: remove write-protected file 'access_log'? y
[merc@localhost logs]$ ln -s /etc/passwd_for_example access_log
[merc@localhost logs]$ ls -l
total 84
lrwxrwxrwx 1 merc merc 23   Oct 5 19:26 access_log -> /etc/passwd_for_example
-rw-r--r-- 1 root root 75335 Oct 5 19:27 error_log
-rw-r--r-- 1 root root 5     Oct 5 19:27 httpd.pid
[merc@localhost logs]$

Obviously, this can only be done if the attacker has login access to the web server. The next time Apache is run, the web server will append to /etc/passwd. This would make the system unstable and prevent any further login by users. The solution is to ensure that the logs directory is not writable by other users.
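A quick check for this problem can be scripted. The following is a minimal sketch, assuming a POSIX find; check_log_dir is a hypothetical name, and the test only looks at the write bit for other users, not at ownership or group permissions.

```shell
#!/bin/sh
# check_log_dir: warn if a directory is writable by "other" users.
# $1: directory to check (for example, Apache's logs directory)
check_log_dir() {
    # -prune stops find from descending, so only the directory itself
    # is tested; it is printed only if the other-write bit is set.
    if [ -n "$(find "$1" -prune -perm -0002 2>/dev/null)" ]; then
        echo "WARNING: $1 is writable by other users"
        return 1
    fi
    echo "OK: $1"
}
```

With the permissions shown in the listing above, check_log_dir /usr/local/apache2/logs would print the warning and return a non-zero status.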

Logs As Modifiable Text Files

Log files are usually stored as text files, and they are therefore very easy to:

  • Forge. A cracker might want to hide any trace of his or her attack, and might therefore edit out those lines that would highlight the attacks.

  • Delete. Logs might be quite valuable for your company for access-analyzing purposes, and missing information might represent a problem—and a loss of money.

  • Steal. This wouldn’t happen very often, but it’s a possibility, especially if your logs have any value for data mining.

A possible solution to this problem is to protect the logs (and your system) properly so that these things can’t happen. Another solution is remote logging, discussed in the second part of this chapter.

Logging Programs and Security

Because the logger program is run as root, it must be kept simple, and the code must be audited for vulnerabilities like buffer overflows. In addition, the directory where the program resides must be owned by root, and non-root users must not have write permissions. Otherwise, they could delete the logging program and replace it with a malicious one.

Logs and Disk Space

Because Apache logs can be big, you need to monitor their size. For instance, a cracker might send many requests, with the sole purpose of filling up the disk space, and then perform an attack (buffer overflow, for instance). If Apache’s logs and other system logs share the same partition, the cracker will be able to perform any kind of buffer overflow attack without being logged.

Remember that all the system logs should be directed to a partition that will not cause system-wide interruptions if it fills up, such as /var. Further, the log files should be compressed once they are archived to save disk space. In addition, you can use a script that periodically checks the size of the log directory and issues a warning if too much disk space is being used, or if the log partition is full. The script could be as simple as this:

#!/bin/bash

partition="/dev/hda5"

free_space=`df | grep $partition | cut -b 41-50`
echo $free_space

if [ $free_space -le 5000 ];then
       message="WARNING: Only $free_space blocks left on $partition";
       logger -p local0.crit $message
       echo $message | mail -s "Partition problem" merc@localhost
fi

This script assumes that the log partition is /dev/hda5. The available free space is taken out of the df command using cut (free_space=`df | grep $partition | cut -b 41-50`). If there are 5000 blocks or fewer left (if [ $free_space -le 5000 ];then), the script logs the problem using logger, and sends an e-mail to merc@localhost. This script should be placed in your crontab, and should be executed every 15 minutes. This script can easily be improved so that it stores the available space each time in /tmp/free_space, and warns you if there has been a drastic change in the available space on the log partition. See Chapter 7 for more scripts aimed at automated system administration tasks.
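The improvement mentioned above might be sketched as follows. The function names, the state file argument, and the threshold are assumptions for this example; it also reads the free space with df -P and awk rather than cut, so that it does not depend on fixed column positions.

```shell
#!/bin/sh
# free_blocks: print the available blocks on the file system holding $1.
# df -P keeps every entry on a single line; column 4 is "Available".
free_blocks() {
    df -P "$1" | awk 'NR == 2 { print $4 }'
}

# space_dropped: succeed if free space fell by more than $3 blocks.
# $1: previous reading, $2: current reading, $3: allowed drop
space_dropped() {
    [ $(( $1 - $2 )) -gt "$3" ]
}

# check_space: compare the current reading with the stored one.
# $1: directory on the log partition, $2: state file, $3: allowed drop
check_space() {
    current=$(free_blocks "$1")
    if [ -f "$2" ]; then
        previous=$(cat "$2")
        if space_dropped "$previous" "$current" "$3"; then
            echo "WARNING: free space fell from $previous to $current blocks"
        fi
    fi
    # Remember this reading for the next run
    echo "$current" > "$2"
}
```

Run from cron, check_space /var/log /tmp/free_space 50000 would print a warning whenever more than 50000 blocks disappear between two runs; the output could be passed to logger or mail as in the original script.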

Unreliable Logging

After a DOS attack, the server’s accountability (the logs) may be compromised; therefore, Apache might not be able to write the entries about the attack on the log files. This means that if an attacker performed a DOS attack against your server, you might not be able to investigate the attack.

Reading the Log Files

There is one issue that seems to be overlooked by many system administrators: it is important to read and analyze log files. As I mentioned earlier, Apache has two types of logs: the error log and the access log.

The Error Log

An ideal error log on a running server is an empty one (apart from information about the server starting and stopping), when the error level is set to notice. For example, a “File not Found” error probably means that there is a broken link somewhere on the Internet pointing to your web site. In this case, you would see a log entry like this:

[Sat Oct 05 20:05:28 2003] [error] [client 127.0.0.1] File does not exist: /home/merc/public_html/b.html, referer: http://localhost/~merc/a.html

The webmaster of the referrer site should be advised that there is a broken link on their site. If there is no answer, you might want to configure your Apache server so that the broken link is redirected to the right page (or, if in doubt, to your home page).

If crackers are looking for possible exploits, they will generate “File not Found” entries in the error log, so keeping the error log as clean as possible will help to locate malicious requests more easily. Some exploit attempts are logged in the error_log. For instance, you could find:

[merc@localhost httpd]$ grep -i formmail error_log
[Sun Sep 29 06:16:00 2003] [error] [client 66.50.34.7] script not found or unable to stat: /extra/httpd/cgi-bin/formmail.pl
[merc@localhost httpd]$

The formmail script is widely used, but it generates a number of security issues.

A segmentation fault problem needs attention as well. Apache should never die, unless there is a problem in one of the modules or an attack has been performed against the server. Here is an example:

[Sun Sep 29 06:16:00 2002] [error] [notice] child pid 1772 exit signal Segmentation fault (11)

If you see such a line in the log file, you will have to see what was going on at the time in the server’s activity (possibly reading the access_log file as well) and consider upgrading Apache and its modules as soon as possible. Because of Apache’s extensive use and deployment, most such problems in the core Apache package have been eliminated. Therefore, a segmentation fault message usually implicates an after-market or third-party module failure, or a successful DOS attack.

The Access Log

The access log includes information about what the user requested. If the error log reports a segmentation fault, you can use the access log to find out what caused Apache to die. Remember that if the cause of death is really sudden, because of buffering issues, the latest log information might not be in the log file.
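To line up the two logs, it helps to convert an error log timestamp into the form used by the access log. The sketch below assumes the timestamp layouts shown in this chapter's examples; the function names access_stamp and requests_at are hypothetical.

```shell
#!/bin/sh
# access_stamp: turn "Sun Sep 29 06:16:00 2002" (error log style)
# into "29/Sep/2002:06:16" (access log style, minute precision).
access_stamp() {
    echo "$1" | awk '{ print $3 "/" $2 "/" $5 ":" substr($4, 1, 5) }'
}

# requests_at: print the access log lines logged during that minute.
# $1: error log timestamp (without the brackets)
# $2: path to the access log
requests_at() {
    # -F treats the pattern as a fixed string, so "[" is matched literally
    grep -F "[$(access_stamp "$1")" "$2"
}
```

For the segmentation fault entry shown earlier, requests_at "Sun Sep 29 06:16:00 2002" access_log would list the requests logged in the same minute.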

You can also use the access log to check whether someone is trying to break into your system. Some attacks are easy to identify by checking for the right string in the log. You can find the entries for many Windows-aimed attacks just by looking for the exe string in the access log. For example:

[root@localhost logs]# grep -i exe access_log
200.216.141.59 - - [29/Sep/2003:06:25:22 +0200] "GET /_vti_bin/shtml.exe HTTP/1.0" 404 288
200.216.141.59 - - [29/Sep/2003:06:31:33 +0200] "GET /_vti_bin/shtml.exe HTTP/1.0" 404 288
193.253.252.93 - - [02/Oct/2003:02:17:53 +0200] "GET /scripts/..%c0%af../winnt/system32/cmd.exe?/c+dir+c: HTTP/1.1" 404 319
151.4.241.194 - - [02/Oct/2003:02:34:46 +0200] "GET /scripts/..%255c%255c../winnt/system32/cmd.exe?/c+dir" 404 -
[root@localhost logs]#

The main problem with using grep to look for attacks is that URLs can be URL-encoded (see Appendix B for more information). This means that the last entry you saw in the access_log shown above could be written as:

151.4.241.194 - - [02/Oct/2003:02:34:46 +0200] "GET /scripts/..%255c%255c../winnt/system32/cmd.%65x%65?/c+dir" 404 -

When encoded in this way, this URL would escape the grep filter. To perform an effective search, you will need to URL-decode all the URLs from your log files (you can use Perl for this), and then compare them with the suspicious ones. This simple script (Listing 3-1) should come in handy.

Listing 3-1. A Simple Script to Use As a Filter

#!/usr/bin/perl
use URI::Escape;
use strict;

# Declare some variables
#
my($space)="%20";
my($str,$result);

# The cycle that reads
# the standard input
while(<>){

   # The URL is split, so that you have the
   # actual PATH and the query string in two
   # different variables. If you have
   # http://www.site.com/cgi-bin/go.pl?query=this,
   # $path = "http://www.site.com/cgi-bin/go.pl"
   # $qstring = "query=this"
   my ($path, $qstring) = split(/\?/, $_, 2);

   # If there is no query string, the result string
   # will be the path...
   $result = $path;

   # ...BUT! If the query string is not empty, it needs
   # some processing so that the "+" becomes "%20"!
   if(defined($qstring) && $qstring ne ""){
           $qstring =~ s/\+/$space/ego;
           $result .= "?$qstring";
   }

   # The string is finally unescaped...
   $str = uri_unescape($result);

   # ...and printed!
   print($str);
}

Note that the script is slightly complicated by the fact that a + (plus) in the query string (and only in the query string) must be converted into %20 ($qstring =~ s/\+/$space/ego;), which is then translated into a space once the string is URL-decoded ($str = uri_unescape($result);).

You should call this script urldecode, place it in /usr/local/bin, and give it executable permission (chmod 755 /usr/local/bin/urldecode). To test it, just run it:

[root@localhost logs]# urldecode
hello
hello
this is a test: .%65x%65
this is a test: .exe
[root@localhost logs]#

The script acts as a filter as it echoes information to the standard output. The command to test your logs should now be:

[root@merc root]# cat access_log | urldecode | grep exe

You can change exe into anything you want to look for in your log.
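Rather than running one search per string, you could keep the strings in a file and search for all of them at once. This is only a sketch: scan_for and the signature file are hypothetical, and it assumes the log lines have already been decoded, for instance by the urldecode filter described above.

```shell
#!/bin/sh
# scan_for: print decoded log lines containing any listed string.
# $1: file with one suspicious string per line (fixed strings, not regexes)
# The log lines are read from standard input.
scan_for() {
    # -i ignores case, -F disables regex matching, -f reads the patterns
    grep -i -F -f "$1"
}
```

With a file such as signatures.txt containing exe, formmail, and similar strings, the command becomes cat access_log | urldecode | scan_for signatures.txt.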

Remote Logging

In some cases, you’ll want to store your logs on a separate, secure server on your network dedicated to logging. This means that your server won’t be responsible for holding the logs, and if some crackers gain access to it, they won’t be able to delete their tracks (unless they crack the log server as well).

There are two ways of doing this. The first one is to instruct Apache to send all the log messages to the standard Unix log server, syslogd. The second solution is to build a custom-made logger script that sends the log entries to a remote server. You can implement this in several ways, and it might prove to be better for security and simplicity.

In the following sections I will explain how syslogd works, how to configure Apache and syslogd so that logs are stored on a remote log server, and how to store log files remotely without using syslogd, encrypting the log information along the way.

Logging in Unix

Logging is a critical task. On a machine that acts as a server, there might be several daemons that log important messages continuously. Unix has logging facilities that make this completely transparent, and so Unix programs don’t have to worry about where or how their messages are logged, or know about all the problems concerning locking or integrity of log files. They can use the ready-to-use functions that abstract the whole logging mechanism using the syslogd logging daemon.

syslogd at Work

The syslogd daemon runs in the background and waits for new messages coming from either /dev/log (a UNIX domain socket) or the 514 UDP port. For security reasons, syslogd will not listen to the 514 UDP port by default. This means it will only work locally, and not remotely (otherwise, everyone on the Internet could log information on your server).

A log message is a line of text, but it has two important attributes: the facility (to specify the type of program that is logging the message), and the log level (which specifies how urgent the message is).

The facility can be (from man 3 syslog):

  • LOG_AUTH: security/authorization messages (DEPRECATED Use LOG_AUTHPRIV instead)

  • LOG_AUTHPRIV: security/authorization messages (private)

  • LOG_CRON: clock daemon (cron and at)

  • LOG_DAEMON: system daemons without separate facility value

  • LOG_FTP: ftp daemon

  • LOG_KERN: kernel messages

  • LOG_LOCAL0 through LOG_LOCAL7: reserved for local use

  • LOG_LPR: line printer subsystem

  • LOG_MAIL: mail subsystem

  • LOG_NEWS: USENET news subsystem

  • LOG_SYSLOG: messages generated internally by syslogd

  • LOG_USER (default): generic user-level messages

  • LOG_UUCP: UUCP subsystem

The log level can be (from man 3 syslog):

  • LOG_EMERG: system is unusable

  • LOG_ALERT: action must be taken immediately

  • LOG_CRIT: critical conditions

  • LOG_ERR: error conditions

  • LOG_WARNING: warning conditions

  • LOG_NOTICE: normal, but significant, condition

  • LOG_INFO: informational message

  • LOG_DEBUG: debug-level message

A program can use three standard library functions to log a message. They are:

void openlog(const char *ident, int option, int facility);
void syslog(int priority, const char *format, ...);
void closelog(void);

A program calls openlog() as soon as it is started; then, it uses syslog() to send log information to syslogd, and calls closelog() just before finishing its execution. The program sets the facility only once, with the facility parameter of the openlog() function, and then decides the importance level of each message, with the priority parameter in syslog(). The function syslog() uses the /dev/log UNIX socket to communicate with syslogd. You can see an example of this approach in Figure 3-1.


Figure 3-1. A diagrammatic representation of the logging process

You can see that Apache and other daemons don’t actually write the log files themselves, but talk to syslogd instead. It is syslogd’s responsibility to deal with the log requests.

Configuring syslogd

The syslogd daemon receives the logging requests issued by every running daemon on the system, regardless of the level of importance. Storing every log request onto a single file might lead to a huge and unmanageable log file, full of information of all kinds and levels of importance. Through /etc/syslog.conf you can decide:

  • What messages to consider (facility and level)

  • Where they should be stored

All the other log messages received by syslogd will be ignored. The syslog.conf file (usually found in the /etc directory) looks like this:

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.*                                    /dev/console

# Log anything (except mail) of level info or higher.
# Don’t log private authentication messages!
*.info;mail.none;authpriv.none;cron.none   /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                 /var/log/secure

# Log all the mail messages in one place.
mail.*                                     /var/log/maillog

# Log cron stuff
cron.*                                     /var/log/cron

# Everybody gets emergency messages
*.emerg                                    *

# Save news errors of level crit and higher in a special file.
uucp,news.crit                             /var/log/spooler

# Save boot messages also to boot.log
local7.*                                   /var/log/boot.log

Each line contains two fields separated by one or more tab characters. The first field, on the left hand side, contains the facility and the level. For example, news.crit means “facility: news, level: critical.” A star symbol (*) means “any.” The second field is the file where the log information will be stored. The following line means that cron messages of any importance are stored in /var/log/cron:

cron.*                                     /var/log/cron

Look at syslogd’s man pages (man syslogd and man syslog.conf) for more detailed information on how to configure syslogd.

Logging on a Remote Host

At this point, you may wonder why syslogd was proposed to perform remote logging, when the syslog() call offers no way to specify a remote server.

The reason is simple: a program (Apache, for instance) always uses the normal Unix domain socket (/dev/log) to log its messages. The syslogd daemon can be configured so that it doesn’t store the received log messages on a local file, but sends them to another syslogd daemon running on the Internet and (obviously) listening to the 514 UDP port. Adding a line in the syslog.conf file can do this:

local0.info @remote_log_server.your_net.com

This forwards all the requests marked with the local0 facility and a log level of info or higher to the remote_log_server.your_net.com host. You can see an example of this approach in Figure 3-2. Please note that here, as an example, the syslog daemon A doesn’t write any log files.
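When several forwarding rules accumulate, it can be handy to list them. The following sketch assumes the two-field syslog.conf layout described earlier; remote_targets is a hypothetical helper, and it skips comment lines.

```shell
#!/bin/sh
# remote_targets: list the selectors that are forwarded to another host.
# $1: path to a syslog.conf-style file
remote_targets() {
    # Skip comments; keep lines whose second field starts with "@",
    # then print the selector and the host name without the "@".
    awk '!/^[ \t]*#/ && NF >= 2 && $2 ~ /^@/ { print $1, substr($2, 2) }' "$1"
}
```

For a file containing only the line shown above, remote_targets /etc/syslog.conf would print local0.info followed by remote_log_server.your_net.com.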


Figure 3-2. Syslog’s structure for remote logging

Testing syslogd

The easiest way to test if your logging facility is working well is to use the logger utility (or the equivalent on your system). Suppose that you have an entry like this in the syslog.conf file:

# Our testing bed
local0.*                               /var/log/apache_book

After modifying the /etc/syslog.conf file, remember to send a HUP signal to syslogd:

[root@local_machine root]# killall -HUP syslogd

Now, log a message using the logger utility, like this:

[root@local_machine root]# logger -p local0.crit "Hello readers…"

The /var/log/apache_book file will read like this:

[root@local_machine root]# cat /var/log/apache_book
Oct 6 19:35:31 localhost logger: Hello readers…

Now, modify your syslog.conf in your local machine so that it contains:

local0.*                             @remote_machine

Also modify the syslog.conf in the host remote_machine so that it contains:

local0.*                             /var/log/apache_book

You need to run syslogd with the -r option on the remote machine. Running this command on the local machine:

[root@local_machine root]# logger -p local0.crit "Hello readers…"

will result in this message on the remote machine:

[root@remote_machine root]# cat /var/log/apache_book
Oct 6 19:35:31 local_machine logger: Hello readers…
[root@remote_machine root]#

It worked! All the logger command does is call syslog(); it doesn’t know what is going to happen to the message. It could be written to a log file, it could be sent to a remote server, or it could even be completely ignored. This is the beauty and the power of the syslogd daemon.

Apache Logging Using syslogd

In this section, I will explain how to configure Apache so that it logs its error log and access log using syslogd.

Logging error_log through syslogd

From Apache 1.3, all you have to do is write syslog where the file name would be written in the httpd.conf file, like this:

LogLevel notice
ErrorLog syslog

The log levels (listed in Table 3-1) are identical to the ones used by Apache in its error log.

The syslog facility ID used by Apache by default is local7. Therefore, you need to add a line in the syslog.conf file like this:

local7.*                          /var/log/apache_error_log

You can set the facility in the httpd.conf file, writing this instead:

LogLevel notice
ErrorLog syslog:local0

In this book, I will assume that your Apache uses the facility local7. You should tell syslogd that its configuration file has changed:

[root@localhost root]# killall -HUP syslogd

Restart your Apache daemon:

[root@localhost root]# /usr/local/apache2/bin/apachectl restart

If everything went well, you should see Apache’s log messages in /var/log/apache_error_log:

[root@localhost root]# tail -f /var/log/apache_error_log
Oct 6 20:30:53 localhost httpd[1837]: [notice] Digest: generating secret for digest authentication ...
Oct 6 20:30:53 localhost httpd[1837]: [notice] Digest: done
Oct 6 20:30:54 localhost httpd[1837]: [notice] Apache/2.0.40 (Unix) DAV/2 PHP/4.2.3 configured -- resuming normal operations

It worked; Apache is now logging its errors through the syslogd daemon. Of course, if you want a remote host to actually store the messages, you have to change the file syslog.conf to:

local7.*                          @remote_log_server.your_net.com

You also need to restart your syslog daemon.

The server remote_log_server should preferably be on your own network, and should be configured to store messages from facility local7 on a local file. Of course, the fact that a remote logging server would be receiving the log entries would be totally transparent to Apache. Remember that the syslogd daemon on the server remote_log_server must be started with the -r option.

Logging access_log through syslog

Configuring Apache’s access_log through syslog is less straightforward than doing the same operation with error_log. There is no syslog option for the CustomLog directive. The reason behind this is that sending access log information to the syslog daemon isn’t something that many users would do, because syslogd is quite slow. If you get 10 requests a second, you might miss something important. Nevertheless, it can be configured easily, even if Apache doesn’t support it directly. To do this, you can use a logging program that, instead of writing to a file, sends information to the syslog daemon.

First of all, add the following line to your /etc/syslog.conf file:

local1.info /var/log/apache_access_log

Apache’s httpd.conf file should look like this:

CustomLog "|/usr/bin/logger -p local1.info" common

If your system doesn’t have a logger program, you should be able to find its equivalent quite easily.

Now, make syslogd aware of the configuration changes and restart Apache:

[root@localhost root]# killall -HUP syslogd
[root@localhost root]# /usr/local/apache2/bin/apachectl restart

Connect to your web server:

[root@localhost root]# telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: me

HTTP/1.1 200 OK
[...]

telnet> quit
Connection closed.
[root@localhost root]#

The new apache_access_log file should have a log entry like this:

[root@localhost root]# cat /var/log/apache_access_log
Oct 6 21:38:16 localhost logger: 127.0.0.1 - - [06/Oct/2002:21:38:13 +0800] "GET / HTTP/1.1" 200 1018

This entry shows that Apache is now logging its access log through syslog. Of course, you can easily change syslog’s configuration so that the access logs are redirected to a different machine, like this:

local1.info @remote_log_server

It is best to log all the access log messages with an info log level, so they are all of equal importance (unlike the log entries in Apache’s error log).

Advantages and Disadvantages of Logging on a Remote Machine

Although sending log messages to a remote machine can sound advantageous, there are disadvantages that you need to be aware of. Here are the advantages of logging on a remote machine:

  • A cracker won’t be able to delete or modify the logs after breaking into the system that runs Apache. The cracker could still violate the log server, however. Therefore, the log server should have much heavier security and accept only log requests, to minimize security risks.

  • A cracker won’t be able to perform a log-oriented DOS attack against the server that runs Apache (filling all the disk space with long requests, for instance), because the logs are written elsewhere. A cracker can still try to fill up the partition that contains the log files on the log server, but the direct target of such an attack wouldn’t be the web server.

  • There are no file permission problems on the log files, because they don’t reside on the local machine. You could argue that the logging machine would have the same issues, but such a machine shouldn’t have any users or external access, and it should be solely dedicated to logging.

  • If you have several servers, you can make sure that all the logs are stored in one spot in the network. This would make organization of backups, log merging, and log analysis much easier.

There are several disadvantages to remote logging using syslog:

  • It is unreliable. A log line could simply get lost on the way to the log server. There is no acknowledgment of any sort by the remote logging server regarding the information being written. An attacker could cause a denial of service on the remote logger by flooding it, or feeding it bad data.

  • Centralizing logging to one server means that the log server represents a single point of failure.

  • It is simple to create fake log entries. syslog’s protocol is based on UDP, and it is extremely simple to send a forged or spoofed UDP packet, because UDP is connectionless. You need to firewall the network carefully, and even then, a cracker might be able to create misleading log entries.

  • It’s based on clear text. This means that the information can be read or altered on its way to the log server, because there is no reliable mechanism for checking that the packet that arrived is the same as the packet that was sent.

Some of these problems are structural and cannot be easily solved.
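The forgery problem is easy to see in a few lines of code. The following Python sketch is an illustration only: it uses the loopback interface, a throwaway UDP socket stands in for the log server, and the host name "webserver1" is made up. It forges the host name inside the message, which needs no special privileges; spoofing the packet's source address as well would require raw sockets.

```python
import socket


def send_syslog_udp(message: str, host: str, port: int) -> None:
    """Send one syslog-style datagram. No handshake, no acknowledgment."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(message.encode("ascii"), (host, port))


def demo() -> str:
    """Send a forged entry to a stand-in log server and return what it got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))      # any free loopback port
        receiver.settimeout(2.0)
        port = receiver.getsockname()[1]

        # <142> is local1.info; the host name is whatever the sender claims.
        forged = "<142>Oct  6 21:38:16 webserver1 logger: forged entry"
        send_syslog_udp(forged, "127.0.0.1", port)

        data, _addr = receiver.recvfrom(1024)
        return data.decode("ascii")
```

The receiver has nothing to go on except the text of the datagram, and the sender never learns whether the message arrived.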

Secure Alternatives

The syslog option has pervasive structural vulnerabilities, but studying its problems and vulnerabilities will give you a good start to designing a remote logging architecture, without making the same mistakes again.

Here are alternatives that help to solve some of the shortcomings:

  • syslog-ng (http://www.balabit.hu/en/downloads/syslog-ng/) This is a replacement for the standard syslog daemon. syslog-ng (the ng means next generation) can be configured to support digital signatures and encryption, to make sure that log messages weren’t modified. It can also filter log messages according to their content, and log messages using TCP rather than UDP (TCP is a connection-oriented protocol and is much harder to spoof). Additionally, it can run in a jailed environment (see Chapter 6 for detailed information on how to run Apache in a “jail” using chroot()).

  • nsyslogd (http://coombs.anu.edu.au/~avalon/nsyslog.html) This is another replacement of the syslog daemon, with some interesting features, including the use of TCP instead of UDP, and the ability to encrypt connections to prevent data tampering using OpenSSL. Unfortunately, while it is a feasible solution for several Unix systems, it doesn’t work well on Linux at the time of this writing.

  • socklog (http://smarden.org/socklog/) This is yet another replacement of the syslog daemon. The main strength of socklog is that it’s based on daemontools (http://cr.yp.to/daemontools.html). daemontools’ architecture makes it possible to have encryption, authentication, compression, and log rotation quite easily. It’s definitely worth looking into.

  • The e-mail available at http://cert.uni-stuttgart.de/archive/honeypots/2002/07/msg00100.html includes tips on how to hide a remote log server.

Remote Logging without syslogd

Syslog has several limitations. For example, as I mentioned, it is not feasible to send a production server’s access log through syslog. In this section, I will explain how to configure your server so that it logs encrypted information over the network.

Two Possible Designs

To write logs over the net, you need to use a custom written program to log Apache’s messages on a remote server. You can configure Apache this way:

CustomLog "|/usr/bin/custom_logging_program" common
ErrorLog "|/usr/bin/custom_logging_program"

The program custom_logging_program could be a program that reads the log messages from its standard input, and sends them to a remote server via a TCP connection. The remote log server would be a program written specifically to talk to custom_logging_program. This way, all the problems connected with syslogd could be solved, and you would be absolutely sure that the system works the way you want it to work. There is only one problem: designing a functional client/server application like this can be very complicated. At the beginning of this chapter, I mentioned that logging programs should be kept as simple as possible. A program like this would be anything but simple; it would need to use cryptography, and would need to communicate with the server following all the rules set by the (newly designed) protocol. Writing the logging server program would be even harder.
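To give an idea of what the client half of such a design involves, here is a deliberately minimal Python sketch. It is not the program used later in this chapter: there is no encryption, no authentication, and no reconnection logic, which is exactly the complexity discussed above. The function names are my own, and the server address passed to main() is up to you.

```python
import socket
import sys
from typing import Iterable


def forward_lines(lines: Iterable[str], conn: socket.socket) -> int:
    """Send each log line over an already-connected TCP socket.

    Returns the number of lines sent. A newline is appended when a line
    lacks one, so the receiver can split the stream back into entries.
    """
    sent = 0
    for line in lines:
        if not line.endswith("\n"):
            line += "\n"
        conn.sendall(line.encode("utf-8"))
        sent += 1
    return sent


def main(host: str, port: int) -> None:
    """Read log lines from standard input (the pipe opened by Apache)
    and forward them to the log server at host:port."""
    with socket.create_connection((host, port)) as conn:
        forward_lines(sys.stdin, conn)
```

Even this toy version raises the questions a real protocol has to answer: what happens when the connection drops, how the server knows who is talking to it, and how either side detects a modified line.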

To simplify the solution’s design, you could use an Apache module that deals with SQL logging directly. A very good choice, for example, is mod_log_sql (http://www.grubbybaby.com/mod_log_sql/). The main advantage of this approach is that it is very easy to configure. You pay for this simplicity in terms of lack of flexibility, because:

  • You cannot encrypt the log information.

  • You can only deal with the access log (not the error log).

  • You can only log using MySQL as a SQL back-end.

Basically, the first solution is powerful, but often too complex to be implemented (reinventing the wheel often means rediscovering the same bugs and vulnerabilities). The second solution is much easier to configure, but has very strong limitations.

A Powerful, Hybrid Design

I will now propose a hybrid solution, which unites the simplicity of SQL logging with the power of Perl and encryption. The idea is to write a custom logging program, as I mentioned in the previous section. However, rather than implementing a transport protocol from scratch, the logging program will use Perl’s powerful DBD/DBI libraries (Database Driver/Database Interface) in order to access any SQL server and store logging information on a SQL database in the network. I will also use one of Perl’s libraries to encrypt the logging information before sending it to the database server. In this case, symmetric encryption is acceptable, because I will have access to both the encryption and the decryption script (hence, the key won’t have to travel anywhere).

I will use the following components to implement this solution:

  • MySQL as my database server. You can use any database supported by Perl’s DBI drivers, however. For example, many system administrators prefer PostgreSQL.

  • A database table, able to store the logging information.

  • Perl’s DBD drivers for MySQL.

  • Perl’s Crypt::CBC.

  • Perl’s Crypt-Blowfish-2.09.

  • A custom script that encrypts the stored information.

  • A custom script that can read the database and decrypt the information contained in it.

The MySQL Database

You will first need to install MySQL on your system (you can actually follow these instructions with any other database server by marginally changing the Perl code). Then, you can create the database with a mysqladmin command:

[root@merc root]# mysqladmin create apache

This command creates a database called apache. You now need to create a table to store the logging information:

[root@merc root]# mysql apache
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 41 to server version: 3.23.54

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE access_log (
-> sequence int(10) NOT NULL auto_increment,
-> log_line blob,
-> PRIMARY KEY (sequence)
-> ) TYPE=MyISAM;
Query OK, 0 rows affected (0.00 sec)

mysql>

As you can see, the table is extremely simple: it only contains sequence, a column with a sequence number (automatically generated by MySQL), and log_line, the column that will contain the actual log line.
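If you want to experiment with this table layout before setting up MySQL, the following Python sketch reproduces it with SQLite from Python's standard library. SQLite is only a stand-in here: the column types and the auto-increment syntax differ from MySQL's, and the helper names are mine.

```python
import sqlite3


def open_log_db() -> sqlite3.Connection:
    """Create an in-memory database with an access_log table that mirrors
    the MySQL one: a generated sequence number plus a binary log line."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE access_log ("
        " sequence INTEGER PRIMARY KEY AUTOINCREMENT,"
        " log_line BLOB)"
    )
    return conn


def store(conn: sqlite3.Connection, log_line: bytes) -> int:
    """Insert one log line and return the sequence number it was given."""
    cur = conn.execute(
        "INSERT INTO access_log (log_line) VALUES (?)", (log_line,)
    )
    conn.commit()
    return cur.lastrowid
```

The log_line column is binary because it will hold encrypted data rather than readable text.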

Your database server is now ready to go. I suggest you leave this MySQL session open so that you can see whether information actually does get added to your table while testing the script. Note that you should set a login and password for access to your MySQL server.

The Perl Components

You now need to install all the Perl components that will be needed by the script in order to work. They are Crypt-CBC-2.08, Crypt-Blowfish-2.09, and DBD-mysql-2.9002. They can all be found on CPAN (http://www.cpan.org, Comprehensive Perl Archive Network: a site that contains every single third-party Perl module), and they can all be installed with the usual perl Makefile.PL. Here is the installation log for Crypt-CBC:

[root@merc root]# tar xvzf Crypt-CBC-2.08.tar.gz
Crypt-CBC-2.08/
Crypt-CBC-2.08/t/
[...]
Crypt-CBC-2.08/MANIFEST
[root@merc root]# cd Crypt-CBC-2.08
[root@merc Crypt-CBC-2.08]# perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for Crypt::CBC
[root@merc Crypt-CBC-2.08]# make
cp CBC.pm blib/lib/Crypt/CBC.pm
Manifying blib/man3/Crypt::CBC.3pm
[root@merc Crypt-CBC-2.08]# make install
Writing /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi/auto/Crypt/CBC/.packlist
Appending installation info to /usr/lib/perl5/5.8.0/i386-linux-thread-multi/perllocal.pod
[root@merc Crypt-CBC-2.08]#

The same installation instructions apply to Crypt-Blowfish and DBD-mysql-2.9002.

NOTE You can install packages from CPAN more easily using the CPAN shell, running perl -MCPAN -e shell, followed by install Crypt::CBC. This way, everything is done magically for you and you don’t really get to know what happens behind the scenes.

The Scripts

You now need to configure Apache so that it pipes the logging information to a program. Here is what your httpd.conf should look like:

CustomLog "|/usr/local/bin/custom_logging_program" common

This is what custom_logging_program should contain:

#!/usr/bin/perl

# Libraries...
#
use strict;     # Be strict with coding
use DBI();      # Use DBI drivers
use Crypt::CBC; # Use encryption

# Variables...
#
my($str);    # Holds the encrypted log line
my($cipher); # The cipher object

# Create the cipher object
# NOTE: Blowfish works on 8-byte blocks, so Crypt::CBC expects an 8-byte
# iv; the value printed here is 7 characters long and may need adjusting.
#
$cipher = Crypt::CBC->new( {'key'            => 'my secret key',
                            'cipher'         => 'Blowfish',
                            'iv'             => 'DU#E*UF',
                            'regenerate_key' => 0,
                            'padding'        => 'space',
                            'prepend_iv'     => 0
                           });

# Connect to the database
#
my $dbh = DBI->connect("DBI:mysql:database=apache;host=localhost",
                       "root", "");

# Each log line is fetched and stored into $_...
#
while(<STDIN>){
    $str = $cipher->encrypt($_); # The read line is encrypted...

    # ...and stored onto the database
    $dbh->do("INSERT INTO access_log VALUES ('0',".$dbh->quote("$str").")");
}

$dbh->disconnect(); # Disconnect from the database

exit(0); # End of the program

The code is well commented. The program first creates a Crypt::CBC object using the Crypt::CBC->new() command (refer to Crypt::CBC’s official documentation for more cipher options, perldoc Crypt::CBC). The program reads its standard input (while(<STDIN>)), encrypts the log line ($str = $cipher->encrypt($_);) and stores the encrypted information in the database ($dbh->do("INSERT INTO access_log VALUES ('0',".$dbh->quote("$str").")");).

In order to test it, you should do the following:

  1. Delete any records from the database with the following command:

    mysql> delete from access_log;
    Query OK, 0 rows affected (0.00 sec)
    mysql>

  2. Stop and restart Apache:

    [root@merc root]# /usr/local/apache2/bin/apachectl stop
    /usr/local/apache2/bin/apachectl stop: httpd stopped
    [root@merc root]# /usr/local/apache2/bin/apachectl start
    /usr/local/apache2/bin/apachectl start: httpd started

  3. Connect to your Apache server requesting some pages using your browser.

  4. Check if any new entries have been created in your database:

    mysql> select * from access_log ;
    +--------+-------------------------------------------+
    |sequence| log_line                                  |
    +--------+-------------------------------------------+
    |   1    |??;F??b92B}?Cx{0748}?`?N^?x{018E}?G?"     |
    |   2    |??x{992B}?C?9[lv?w5I?Tp??Q?x{02EC}c????b?  |
    +--------+-------------------------------------------+
    2 rows in set (0.00 sec)
    mysql>

The information stored in the database is not readable, which means it cannot be modified in any meaningful way. To read your logs, you will need to decrypt them using the same algorithm. Here is the program that will fetch and decrypt the log entries:

#!/usr/bin/perl

# Libraries...
#
use strict;     # Be strict with coding
use DBI();      # Use DBI drivers
use Crypt::CBC; # Use encryption

# Variables...
#
my($str);    # Holds the decrypted log line
my($cipher); # The cipher object

# Create the cipher object (same parameters as the logging script)
# NOTE: Blowfish works on 8-byte blocks, so Crypt::CBC expects an 8-byte
# iv; the value printed here is 7 characters long and may need adjusting.
#
$cipher = Crypt::CBC->new( {'key'            => 'my secret key',
                            'cipher'         => 'Blowfish',
                            'iv'             => 'DU#E*UF',
                            'regenerate_key' => 0,
                            'padding'        => 'space',
                            'prepend_iv'     => 0
                           });

# Connect to the database
#
my $dbh = DBI->connect("DBI:mysql:database=apache;host=localhost",
                       "root", "");

# Prepare the SQL query
#
my $sth = $dbh->prepare("SELECT * FROM access_log");
$sth->execute();

# Main cycle to read the information
#
while (my $ref = $sth->fetchrow_hashref()) {
    $str = $cipher->decrypt($ref->{'log_line'});
    print($str);
}
$sth->finish();     # End of row-fetching
$dbh->disconnect(); # Disconnect from the database

exit(0); # End of the program

This program is very similar to the previous one. The difference is that the information is fetched from the database (while (my $ref = $sth->fetchrow_hashref()) {), and that it is decrypted ($str = $cipher->decrypt($ref->{'log_line'});).

Once your log information has been fetched by this script, you can feed it to “classic” web analyzing tools and store it in a secure location.

Room for Improvement

This is only a starting point. The scripts I provided do work, but there are many features that can—and should—be added. For example:

  • Error management in the source code. The scripts are very basic, and critical error conditions are not tested. This means that if the database server is down, there will be no logging—and you won’t be aware of it.

  • Password management. What happens if you encrypt different rows in your database using different passwords? You will need to make sure you have a mechanism in order to manage this situation. An example could be an extra numeric field that stores the “password number” (meaning: password #1, password #2, and so on) and making the decrypting script aware of those passwords.

  • Field separation. It would be a good idea to store different pieces of logging information in separate fields. This would lead to powerful log analysis. Remember that you should always make sure that your decrypting script is able to generate a common log text file.

  • Error log. The same script can be used for the error log. In this case, field separation would obviously be pointless.
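As an illustration of the field separation idea, here is a Python sketch that splits a line in Apache's common log format into named fields and rebuilds the original line from them. The regular expression and the function names are mine, and the sketch assumes the stock common format; a custom LogFormat would need a different pattern.

```python
import re

# One named group per field of the common log format:
# host ident user [time] "request" status size
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def split_fields(line: str) -> dict:
    """Return the fields of one common-format log line as a dictionary."""
    match = COMMON_LOG.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a common log format line: %r" % line)
    return match.groupdict()


def join_fields(fields: dict) -> str:
    """Rebuild the common-format text line from separate fields."""
    return '{host} {ident} {user} [{time}] "{request}" {status} {size}'.format(
        **fields
    )
```

Each dictionary key could become a column in the table, and join_fields() shows what the decrypting script would have to do in order to keep producing a standard log file.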

However, the example I provided should be enough for you to understand the potential of such a solution.
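The error management point deserves a concrete shape as well. The following Python sketch shows one possible approach, under my own assumptions: the store argument is any function that writes a line to the database and raises an exception on failure, and lines that cannot be stored are written to a local fallback stream, so that a database outage is noticed rather than silent.

```python
from typing import Callable, Iterable, TextIO


def log_with_fallback(lines: Iterable[str],
                      store: Callable[[str], None],
                      fallback: TextIO) -> int:
    """Try to store every line; on failure, keep the line locally.

    Returns the number of lines that could not be stored.
    """
    failures = 0
    for line in lines:
        try:
            store(line)
        except Exception as exc:
            failures += 1
            # Record both the reason and the line that was not stored.
            fallback.write("store failed (%s): %s" % (exc, line))
            fallback.flush()
    return failures
```

In a piped logging program, lines would be standard input and fallback could be standard error or a local file that someone actually monitors.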

Checkpoints
  • Be aware of all your logging options (and problems), and set an ideal environment to enable proper logging regardless of the solution you use. Also clearly document the logging architecture (even if it uses normal files).

  • Check logs regularly or delegate a program to do so; notify the offenders whenever possible.

  • Minimize the number of entries in the error_log. This might mean notifying CGI authors of warnings, notifying referring webmasters that links have changed, and so on.

  • Make sure that there is always enough space for log files (automatic software helps by notifying you of low disk space situations).

  • If your environment is critical or attack-prone, log to a remote machine and encrypt the logging information. In this case, be aware of all the pros and cons of every single remote-logging solution, and try to keep it simple.
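For the disk space checkpoint, a very small script run from cron is often enough. Here is a Python sketch; the 10 percent threshold is an arbitrary assumption of mine, and the function names are illustrative.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction of free space on the file system holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path: str, min_free: float = 0.10) -> bool:
    """True when less than min_free of the file system is still free."""
    return free_fraction(path) < min_free
```

You would point it at the directory that holds your log files and send yourself a notification whenever is_low_on_space() returns True.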