Watching The Web - Backtracking (Page 4 of 6)
So far, it looks
like everything's hunky-dory - but being the suspicious character I am, I
thought it might be worth trying the code out against a few servers before
accepting the script above as a reliable tool. And that's when I hit my first
roadblock - as it turned out, some servers didn't return the "Last-Modified"
header, which meant that the script couldn't determine when the page had been
last modified.

I thought this was pretty strange, as it seemed to be a
violation of the rules laid down in the HTTP protocol. Back to the
specification, then, to see if I could resolve this apparent
conflict...

A little close reading, and the reason for the discrepancy
became clear:
HTTP/1.1 servers SHOULD send Last-Modified whenever feasible.

In other words - they don't *have to*. And
there's many a slip betwixt the cup and the lip...

OK, maybe I should
have read the fine print before writing that script. Still, better late than
never.
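
To make the problem concrete, here's a minimal sketch of how a script might cope with a missing "Last-Modified" header. It is not the script from the earlier pages; the function name findLastModified() and the sample header lines are made up for illustration, and the code assumes the response headers are already available as an array of raw lines.

```php
<?php
// Hypothetical helper (not part of the original script): scans a list of
// raw response header lines and returns the Last-Modified value, or null
// when the server did not send that header.
function findLastModified(array $headerLines)
{
    $name = 'Last-Modified:';
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so match without regard to case.
        if (stripos($line, $name) === 0) {
            return trim(substr($line, strlen($name)));
        }
    }
    return null;
}

// Made-up sample data: one response with the header, one without.
$withHeader = array(
    'HTTP/1.1 200 OK',
    'Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT',
);
$withoutHeader = array(
    'HTTP/1.1 200 OK',
    'Content-Type: text/html',
);

foreach (array($withHeader, $withoutHeader) as $headers) {
    $stamp = findLastModified($headers);
    if ($stamp === null) {
        // The server is within its rights here: the header is only a SHOULD.
        echo "No Last-Modified header - need another way to detect changes\n";
    } else {
        echo "Last modified: $stamp\n";
    }
}
```

In a live check, the array of header lines could come from PHP's get_headers() function (available from PHP 5 onwards), which returns false if the request fails, so that case would need its own test.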
Back to the drawing board, then. After a little thought and a few
carefully posed questions to the PHP mailing lists, it seemed that my initial
plan was still the most reliable - download and store the contents of each URL,
and compare those contents against the previous version to see if there was any
change. This wasn't the most efficient way to do it - but it didn't look like I
had any alternatives.
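
The download-and-compare idea can be sketched roughly as follows. This is only an illustration under a few assumptions: the function name hasChanged() is hypothetical, the stored copy lives in a plain file, a page with no stored copy counts as "changed", and file_put_contents() requires PHP 5 or later.

```php
<?php
// Hypothetical helper: compares freshly downloaded content with the copy
// stored in $cacheFile. When the two differ (or no copy exists yet), the
// new content is saved for the next run and the function returns true.
function hasChanged($content, $cacheFile)
{
    $previous = is_file($cacheFile) ? file_get_contents($cacheFile) : null;
    $changed = ($previous !== $content);
    if ($changed) {
        file_put_contents($cacheFile, $content);
    }
    return $changed;
}

// Offline demonstration with literal strings standing in for page content.
// tempnam() creates an empty file, so the first comparison sees a change.
$cacheFile = tempnam(sys_get_temp_dir(), 'watch');
var_dump(hasChanged('<html>version 1</html>', $cacheFile)); // bool(true)
var_dump(hasChanged('<html>version 1</html>', $cacheFile)); // bool(false)
var_dump(hasChanged('<html>version 2</html>', $cacheFile)); // bool(true)
unlink($cacheFile);
```

For a real URL, the content would come from something like file_get_contents($url), which returns false on failure and so should be checked before the comparison. Storing a checksum such as md5($content) instead of the whole page would save disk space, though the page itself still has to be downloaded on every check.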