Watching The Web

Ever wondered if you could be emailed automatically whenever your favorite Web pages changed? Our intrepid developer didn't just wonder - he sat down and wrote some code to make it happen. Here's his story.

By: The Disenchanted Developer, (c) Melonfire
October 23, 2002

So there I was, minding my own business, working on a piece of code I had to deliver that evening, when the pretty dark-haired girl who sits in the cubicle behind me popped her head over and asked for my help.

"Look", she said, "I need your help with something. Can you write me a little piece of code that keeps track of Web site URLs and tells me when they change?"

"Huh?", was my first reaction...

"It's like this", she explained, "As part of a content update contract, I'm in charge of tracking changes to about thirty different Web sites for a customer, and sending out a bulletin with those changes. Every day, I spend the morning visiting each site and checking to see if it's changed. It's very tedious, and it really screws up my day. Do you think you can write something to automate it for me?"

Now, she's a pretty girl...and the problem intrigued me. So I agreed.

A Little Research

The problem, of course, appeared when I actually started work on her request. I had a vague idea how this might work: all I had to do, I reasoned, was write a little script that woke up each morning, scanned her list of URLs, downloaded the contents of each, compared those contents with the versions downloaded previously, and sent out an email alert if there was a change.

Seemed simple - but how hard would it be to implement? I didn't really like the thought of downloading and saving different versions of each page on a daily basis, or of creating a comparison algorithm to test Web pages against each other.
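For readers who want to see the shape of that first idea, here is a minimal sketch in Python (chosen purely for illustration; the article shows no code at this point). It assumes the pages have already been downloaded, and it stores a SHA-256 digest per URL instead of a full copy of each page, which is a simplification of the plan rather than anything the text proposes. The function names and the example.com URLs are hypothetical.

```python
import hashlib
from typing import Dict, List


def content_digest(body: bytes) -> str:
    """Return a fixed-size fingerprint of a page body."""
    return hashlib.sha256(body).hexdigest()


def find_changed(pages: Dict[str, bytes], previous: Dict[str, str]) -> List[str]:
    """Return the URLs whose content differs from the previous run.

    `pages` maps URL -> body downloaded today (hypothetical input).
    `previous` maps URL -> digest saved on the last run.
    A URL with no saved digest is reported as changed.
    """
    return [url for url, body in pages.items()
            if previous.get(url) != content_digest(body)]


if __name__ == "__main__":
    # Made-up data standing in for real downloads.
    yesterday = {
        "http://example.com/a": content_digest(b"<html>old</html>"),
        "http://example.com/b": content_digest(b"<html>same</html>"),
    }
    today = {
        "http://example.com/a": b"<html>new</html>",
        "http://example.com/b": b"<html>same</html>",
    }
    for url in find_changed(today, yesterday):
        print("changed:", url)  # this is where an email alert would go out
```

Even in this reduced form, every page still has to be downloaded in full on every run, which is the cost the paragraph above is wary of.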

I thought there ought to be an easier way. Maybe the Web server had a way of telling me if a Web page had been modified recently - and all I had to do was read that data and use it in a script. Accordingly, my first step was to hit the W3C Web site, download a copy of the HTTP protocol specification from ftp://ftp.isi.edu/in-notes/rfc2616.txt, and print it out for a little bedside reading. Here's what I found, halfway through:

The Last-Modified entity-header field indicates the date and time at which the origin server believes the variant was last modified.

There we go, I thought - the guys who came up with the protocol obviously anticipated this requirement and built it into the protocol headers. Now to see if it worked...
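To make the idea concrete, the sketch below (Python, for illustration only) shows how a script might turn a Last-Modified value into a timestamp and compare it with the time of its previous check. The helper names and the "last check" date are assumptions made for the example; the header value is just a sample in the HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_last_modified(value: str) -> datetime:
    """Parse an HTTP date such as 'Wed, 09 Oct 2002 11:27:23 GMT'."""
    parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # HTTP dates are expressed in GMT, so treat a naive result as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def modified_since(header_value: str, last_check: datetime) -> bool:
    """True if the server reports a change after our previous check."""
    return parse_last_modified(header_value) > last_check


if __name__ == "__main__":
    header = "Wed, 09 Oct 2002 11:27:23 GMT"  # sample header value
    last_check = datetime(2002, 10, 1, tzinfo=timezone.utc)  # hypothetical
    print(modified_since(header, last_check))
```

Note that the specification only says the server "believes" this is the modification time, so a script relying on it is trusting the server's own bookkeeping.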

The next day at work, I fired up my trusty telnet client and tried to connect to our intranet Web server and request a page. Here's the session dump:

$ telnet darkstar 80
Connected to darkstar.melonfire.com.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Fri, 18 Oct 2002 08:47:57 GMT
Server: Apache/1.3.26 (Unix) PHP/4.2.2
Last-Modified: Wed, 09 Oct 2002 11:27:23 GMT
Accept-Ranges: bytes
Content-Length: 1446
Connection: close
Content-Type: text/html
Connection closed by foreign host.

As you can see, the Web server returned a "Last-Modified" header indicating the date of last change of the requested file. So far so good.
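The same check can be scripted instead of typed by hand. The sketch below is in Python, for illustration only, and rests on one assumption worth flagging: the session dump does not show the request that was sent, but a response made up of headers with no body is what a HEAD request would produce, so HEAD is used here. `parse_response_head` and `fetch_last_modified` are hypothetical helper names; `http.client` is Python's standard HTTP library.

```python
import http.client
from typing import Dict, Optional, Tuple


def parse_response_head(raw: str) -> Tuple[int, Dict[str, str]]:
    """Split a raw response head into (status code, lower-cased headers)."""
    lines = raw.strip().splitlines()
    status_code = int(lines[0].split()[1])  # e.g. "HTTP/1.1 200 OK" -> 200
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def fetch_last_modified(host: str, path: str = "/", port: int = 80) -> Optional[str]:
    """Send a HEAD request (assumed, see above) and return Last-Modified."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().getheader("Last-Modified")
    finally:
        conn.close()


if __name__ == "__main__":
    # Response lines copied from the session dump; no network access needed.
    dump = """HTTP/1.1 200 OK
Date: Fri, 18 Oct 2002 08:47:57 GMT
Server: Apache/1.3.26 (Unix) PHP/4.2.2
Last-Modified: Wed, 09 Oct 2002 11:27:23 GMT
Accept-Ranges: bytes
Content-Length: 1446
Connection: close
Content-Type: text/html
"""
    status, headers = parse_response_head(dump)
    print(status, headers.get("last-modified"))
```

`fetch_last_modified` returns `None` when the server sends no Last-Modified header, so a caller has to be prepared for that case rather than assume the header is always present.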
