Watching The Web
By: The Disenchanted Developer, (c) Melonfire
    2002-10-23


    Table of Contents:
  • Watching The Web
  • Code Poet
  • Digging Deep
  • Backtracking
  • Plan B
  • Closing Time


    Ever wondered if you could be emailed automatically whenever your favorite Web pages changed? Our intrepid developer didn't just wonder - he sat down and wrote some code to make it happen. Here's his story.

    So there I was, minding my own business, working on a piece of code I had to deliver that evening, when the pretty dark-haired girl who sits in the cubicle behind me popped her head over and asked for my help.

    "Look", she said, "I need your help with something. Can you write me a little piece of code that keeps track of Web site URLs and tells me when they change?"

    "Huh?", was my first reaction...

    "It's like this", she explained, "As part of a content update contract, I'm in charge of tracking changes to about thirty different Web sites for a customer, and sending out a bulletin with those changes. Every day, I spend the morning visiting each site and checking to see if it's changed. It's very tedious, and it really screws up my day. Do you think you can write something to automate it for me?"

    Now, she's a pretty girl...and the problem intrigued me. So I agreed.

    A Little Research

    The problem, of course, appeared when I actually started work on her request. I had a vague idea how this might work: all I had to do, I reasoned, was write a little script that woke up each morning, scanned her list of URLs, downloaded the contents of each, compared those contents with the versions downloaded previously, and sent out an email alert if there was a change.

    Seemed simple - but how hard would it be to implement? I didn't really like the thought of downloading and saving different versions of each page on a daily basis, or of creating a comparison algorithm to test Web pages against each other.
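A minimal sketch of that brute-force idea, written for PHP 7 or later rather than the PHP 4 of the article's era. It is not the author's code: the function name `findChangedUrls()` and the example URLs are hypothetical, and an MD5 digest of each page stands in for the saved copy so that only one short string per URL has to be stored between runs. Inline strings take the place of downloaded pages so the example runs without a network connection.

```php
<?php
// Hypothetical helper, not from the article: compares freshly downloaded
// page bodies against the digests recorded on the previous run.
function findChangedUrls(array $pages, array $knownDigests): array
{
    $changed = [];
    foreach ($pages as $url => $body) {
        $digest = md5($body);
        // A URL counts as changed only if it was seen before
        // and its digest no longer matches.
        if (isset($knownDigests[$url]) && $knownDigests[$url] !== $digest) {
            $changed[] = $url;
        }
    }
    return $changed;
}

// Digests saved by the previous run (assumed data for illustration).
$yesterday = [
    'http://example.com/a' => md5('<html>old</html>'),
    'http://example.com/b' => md5('<html>same</html>'),
];

// Page bodies as they might look today.
$today = [
    'http://example.com/a' => '<html>new</html>',
    'http://example.com/b' => '<html>same</html>',
];

$changedUrls = findChangedUrls($today, $yesterday);
print_r($changedUrls);
```

One drawback of this approach is that any page embedding a timestamp, counter, or rotating advertisement produces a different digest on every download, so it would be reported as changed every day.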

    I thought there ought to be an easier way. Maybe the Web server had a way of telling me if a Web page had been modified recently - and all I had to do was read that data and use it in a script. Accordingly, my first step was to hit the W3C Web site, download a copy of the HTTP protocol specification, from ftp://ftp.isi.edu/in-notes/rfc2616.txt, and print it out for a little bedside reading. Here's what I found, halfway through:

    "The Last-Modified entity-header field indicates the date and time at which the origin server believes the variant was last modified."

    There we go, I thought - the guys who came up with the protocol obviously anticipated this requirement and built it into the protocol headers. Now to see if it worked...

    The next day at work, I fired up my trusty telnet client and tried to connect to our intranet Web server and request a page. Here's the session dump:

        $ telnet darkstar 80
        Trying 192.168.0.10...
        Connected to darkstar.melonfire.com.
        Escape character is '^]'.
        HEAD / HTTP/1.0

        HTTP/1.1 200 OK
        Date: Fri, 18 Oct 2002 08:47:57 GMT
        Server: Apache/1.3.26 (Unix) PHP/4.2.2
        Last-Modified: Wed, 09 Oct 2002 11:27:23 GMT
        Accept-Ranges: bytes
        Content-Length: 1446
        Connection: close
        Content-Type: text/html

        Connection closed by foreign host.

    As you can see, the Web server returned a "Last-Modified" header indicating the date of last change of the requested file. So far so good.
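The same check can be sketched in PHP. This is an illustration under stated assumptions, not the article's own script: the helper name `lastModifiedTimestamp()` and the "last check" date are hypothetical, the code needs PHP 7.1 or later for the nullable return type, and the response text is the session dump above pasted in as a string so that nothing here touches the network.

```php
<?php
// Hypothetical helper, not from the article: pulls the Last-Modified value
// out of a raw HTTP response header block and returns it as a Unix
// timestamp, or null if the header is missing or unparseable.
function lastModifiedTimestamp(string $rawHeaders): ?int
{
    // Header names are case-insensitive ("i"); "m" makes ^ and $ match per line.
    if (!preg_match('/^Last-Modified:[ \t]*(.+?)[ \t\r]*$/mi', $rawHeaders, $m)) {
        return null;
    }
    $timestamp = strtotime($m[1]);
    return $timestamp === false ? null : $timestamp;
}

// The response from the telnet session, reproduced as a string.
$response = "HTTP/1.1 200 OK\r\n"
    . "Date: Fri, 18 Oct 2002 08:47:57 GMT\r\n"
    . "Server: Apache/1.3.26 (Unix) PHP/4.2.2\r\n"
    . "Last-Modified: Wed, 09 Oct 2002 11:27:23 GMT\r\n"
    . "Accept-Ranges: bytes\r\n"
    . "Content-Length: 1446\r\n"
    . "Connection: close\r\n"
    . "Content-Type: text/html\r\n"
    . "\r\n";

$modified = lastModifiedTimestamp($response);

// Assumed date of the previous check, chosen only for illustration.
$lastCheck = gmmktime(0, 0, 0, 10, 1, 2002);

// Without a Last-Modified header there is nothing to compare against.
$changed = $modified !== null && $modified > $lastCheck;

echo $changed ? "changed since last check\n" : "unchanged or unknown\n";
```

Servers are not obliged to send Last-Modified, and dynamically generated pages often omit it, which is why the helper returns null rather than guessing.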

     
     