Web Mining with Perl - Checking For Sameness (String::CRC)
String::CRC is a small, little-known module that provides basic checksum
support. Checksums are often used as sanity checks. Given a string of text,
a checksum function generates a number, and even a small modification to the
string drastically changes that number. That is not to say that a checksum is
unique for every string; different strings can share the same checksum. What
matters is that a minor change in the string produces a drastically different
checksum.
How would a checksum be used? Suppose you transfer a file from one machine to
another and want to be sure the file was not corrupted along the way. A
checksum can be run on the original file and on the file at its new location.
If the checksums are the same, the transfer can be considered successful. It
would be very unlikely for a file to be corrupted (by accident) and still
generate the same checksum.
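The file-transfer check described above can be sketched roughly as follows. This is only an illustration, not part of String::CRC itself: the helpers file_crc and write_temp are hypothetical names, the temporary files stand in for the original and the transferred copy, and the whole file is read into memory because crc works on a string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use String::CRC;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file and return its 32-bit CRC.
sub file_crc {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);                       # treat the file as raw bytes
    my $data = do { local $/; <$fh> };  # slurp the entire file
    close($fh);
    $data = '' unless defined $data;
    my $crc = crc($data, 32);           # scalar context, 32 bits
    return $crc;
}

# Hypothetical helper: write some content to a temporary file
# and return the file name (stands in for a real transfer).
sub write_temp {
    my ($content) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    binmode($fh);
    print {$fh} $content;
    close($fh) or die "Cannot close $name: $!";
    return $name;
}

my $original  = write_temp("some file contents\n");
my $good_copy = write_temp("some file contents\n");  # intact copy
my $bad_copy  = write_temp("some file c0ntents\n");  # one byte changed

for my $copy ($good_copy, $bad_copy) {
    if (file_crc($original) == file_crc($copy)) {
        print "$copy: checksums match\n";
    } else {
        print "$copy: checksums differ, file may be corrupted\n";
    }
}
```

Matching checksums make corruption very unlikely, but they do not prove the files are identical, since different contents can share a checksum.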
Here is an example of
String::CRC in action:
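#!/usr/bin/perl
use strict;
use warnings;
use String::CRC;

# Compute a 32-bit CRC for a string.
my $str = " some text string ";
my $crc = crc($str, 32);
print "Checksum '$str' -> $crc\n";

# Append a single space and compute the CRC again.
$str = $str . " ";
$crc = crc($str, 32);
print "Checksum '$str' -> $crc\n";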
Running this script shows that adding just one extra space to the string
significantly changes the result of crc.