Web Mining with Perl
By: Tommie Jones
    2002-03-05


    Table of Contents:
  • Web Mining with Perl
  • Accessing The Net (LWP)
  • Cut Along The Table Lines (HTML::TableExtract)
  • Learning From Links (HTML::LinkExtor)
  • Checking For Sameness (String::CRC)
  • Bringing It All Together
  • Conclusion



    Web Mining with Perl - Checking For Sameness (String::CRC)
    ( Page 5 of 7 )

    String::CRC is a simple, little-known module that provides basic checksum support. Checksums are often used as sanity checks: given a string of text, a checksum function generates a number, and even a small modification to the string drastically changes that number. That is not to say that a checksum is unique for every string; a 32-bit checksum has only about 4.3 billion possible values, so different strings can share one. What matters is that a minor change in the string produces a drastically different checksum.

    How would a checksum be used? Suppose you transfer a file from one machine to another and want to be sure it was not corrupted along the way. Run a checksum on the original file and on the file at its new location. If the checksums are the same, the transfer can be considered successful: it is very unlikely that a file could be corrupted by accident and still generate the same checksum.
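
    The file-transfer check described above could be sketched as follows. This is a minimal illustration, not the author's code: file_crc and same_checksum are hypothetical helper names, the two file paths are placeholders taken from the command line, and the only String::CRC call used is the crc($string, 32) form shown elsewhere in this article.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use String::CRC;

# Hypothetical helper: return the 32-bit CRC of a file's contents.
sub file_crc {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);                        # read raw bytes, no newline translation
    my $data = do { local $/; <$fh> };   # slurp the whole file into one string
    close($fh);
    $data = '' unless defined $data;     # treat an unreadable/empty read as empty
    my ($crc) = crc($data, 32);          # list-context call, as in the article
    return $crc;
}

# Hypothetical helper: true when both files produce the same checksum.
sub same_checksum {
    my ($original, $copy) = @_;
    return file_crc($original) == file_crc($copy);
}

# Usage: perl script.pl ORIGINAL COPY  (both paths are placeholders)
if (@ARGV == 2) {
    my ($original, $copy) = @ARGV;
    print same_checksum($original, $copy)
        ? "Checksums match\n"
        : "Checksums differ: the copy may be corrupted\n";
}
```

    Reading the whole file into memory keeps the sketch short but is only reasonable for small files. A matching checksum makes accidental corruption very unlikely; it does not prove the files are identical.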

    Here is an example of String::CRC in action:

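    #!/usr/bin/perl
    use String::CRC;

    my $str = " some text string ";
    my ($crc) = crc($str, 32);    # 32-bit CRC of the original string
    print "Check sum $str -> $crc\n";

    $str = $str . " ";            # append a single space
    $crc = crc($str, 32);         # recompute the CRC for the changed string
    print "Check sum $str -> $crc\n";
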
    Running this script shows that appending just one extra space significantly changes the result of crc.

     
     