Ever wondered if you could be emailed automatically whenever your favorite Web pages changed? Our intrepid developer didn't just wonder - he sat down and wrote some code to make it happen. Here's his story.

October 23, 2002

With the theory out of the way, I was just about ready to make my first stab at the code. Since I had been told that there would be a large number of URLs to monitor, I decided to store them in a MySQL database table, together with a brief description of each URL. The table I came up with is pretty simple - here's what it looked like:

CREATE TABLE urls (
  id tinyint(3) unsigned NOT NULL auto_increment,
  url text NOT NULL,
  dsc varchar(255) NOT NULL default '',
  date datetime default NULL,
  email varchar(255) NOT NULL default '',
  PRIMARY KEY (id)
);

And here's a sample of the data within it:
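
mysql> select * from urls;
+----+---------------------------+---------------+------+------------------+
| id | url                       | dsc           | date | email            |
+----+---------------------------+---------------+------+------------------+
|  1 | http://www.melonfire.com/ | Melonfire.com | NULL | user@some.domain |
|  2 | http://www.yahoo.com/     | Yahoo.com     | NULL | user@some.domain |
|  3 | http://www.devshed.com/   | Devshed.com   | NULL | user@some.domain |
+----+---------------------------+---------------+------+------------------+
3 rows in set (0.00 sec)
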
Next up, I needed a script that would iterate through this database table, connect to each of the URLs listed within it, and obtain the value of the "Last-Modified" header - basically, replicate what I did with my telnet client, as many times as there were URLs. Here's what I put together:

<?php
// DB connection parameters
// ($db_host, $db_user, $db_pass and $db_name must be set here)

// open database connection
$connection = mysql_connect($db_host, $db_user, $db_pass) or die ("Unable to connect!");
mysql_select_db($db_name);

// generate and execute query
$query1 = "SELECT id, url, date, dsc, email FROM urls";
$result1 = mysql_query($query1, $connection) or die ("Error in query: $query1 . " . mysql_error());

// if rows exist
if (mysql_num_rows($result1) > 0)
{
    // iterate through resultset
    while(list($id, $url, $date, $desc, $email) = mysql_fetch_row($result1))
    {
        $response = "";

        // parse URL into component parts
        $arr = parse_url($url);

        // the path returned by parse_url() already starts with "/";
        // fall back to "/" when the URL has no path at all
        $path = isset($arr['path']) ? $arr['path'] : "/";

        // open a client connection
        $fp = fsockopen ($arr['host'], 80);

        // send HEAD request and read response
        $request = "HEAD " . $path . " HTTP/1.0\r\n\r\n";
        fputs ($fp, $request);
        while (!feof($fp))
        {
            $response .= fgets ($fp, 500);
        }
        fclose ($fp);

        // split response into lines
        $lines = explode("\r\n", $response);

        // scan lines for "Last-Modified" header
        foreach($lines as $l)
        {
            if (ereg("^Last-Modified:", $l))
            {
                // split into variable-value component
                $arr2 = explode(": ", $l);

                // convert the HTTP date into MySQL DATETIME format (GMT)
                $newDate = gmdate("Y-m-d H:i:s", strtotime($arr2[1]));

                // if date has changed from last-recorded date
                if ($date != $newDate)
                {
                    // send mail to owner
                    mail($email, "$desc has changed!", "This is an automated message to inform you that the URL \r\n\r\n $url \r\n\r\nhas changed since it was last checked. Please visit the URL to view the changes.", "From: The Web Watcher <nobody@some.domain>") or die ("Could not send mail!");

                    // update table with new date
                    $query2 = "UPDATE urls SET date = '" . $newDate . "' WHERE id = '" . $id . "'";
                    $result2 = mysql_query($query2, $connection) or die ("Error in query: $query2 . " . mysql_error());
                }
            }
        }
    }
}

// close database connection
mysql_close($connection);
?>

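The header-scanning step in the script above can be tried out on its own, without a database or a network connection. What follows is only a minimal sketch under a few assumptions: the function name last_modified_from_response() is hypothetical, the sample response is made up for illustration, and preg_match() stands in for ereg(), which newer PHP releases no longer provide.

```php
<?php
// Hypothetical helper (not part of the original script): pull the
// Last-Modified value out of a raw HTTP response and return it in
// MySQL DATETIME format (GMT), or null if the header is missing.
function last_modified_from_response($response)
{
    // headers end at the first blank line; ignore anything after it
    $parts = explode("\r\n\r\n", $response, 2);

    foreach (explode("\r\n", $parts[0]) as $line) {
        // header names are case-insensitive, hence the "i" modifier
        if (preg_match('/^Last-Modified:\s*(.+)$/i', $line, $m)) {
            $ts = strtotime(trim($m[1]));
            return ($ts === false) ? null : gmdate("Y-m-d H:i:s", $ts);
        }
    }

    return null;
}

// sample response, invented for this example
$sample = "HTTP/1.1 200 OK\r\n"
        . "Date: Wed, 23 Oct 2002 10:00:00 GMT\r\n"
        . "Last-Modified: Tue, 22 Oct 2002 08:30:15 GMT\r\n"
        . "Content-Type: text/html\r\n\r\n";

$newDate = last_modified_from_response($sample);
echo $newDate . "\n"; // prints 2002-10-22 08:30:15
```

Because the return value uses the same "Y-m-d H:i:s" layout as the date column, it can be compared directly against the stored value to decide whether a notification is due. A null return simply means the server sent no Last-Modified header, in which case this approach cannot tell whether the page has changed.
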
How does this work? Let's look at that next.

