
Web Mining with Perl

It is common knowledge that the Internet is a great data source. It is also common knowledge that it is difficult to get the information you want in the format you need. No longer.

TABLE OF CONTENTS:
  1. Web Mining with Perl
  2. Accessing The Net (LWP)
  3. Cut Along The Table Lines (HTML::TableExtract)
  4. Learning From Links (HTML::LinkExtor)
  5. Checking For Sameness (String::CRC)
  6. Bringing It All Together
  7. Conclusion
By: Tommie Jones
March 05, 2002

Any organization that spends money on marketing research or generating sales leads can benefit from building a web crawler. Instead of spending tens of thousands of dollars on a boxed market research survey, such an organization can use a web crawler to ferret out information from the web.

For example:
  1. Retail-oriented companies can build web crawlers to find trends mentioned in web logs.
  2. Software consulting companies can crawl industry-specific newsgroups and mailing lists for potential customers asking for advice.
  3. Job placement services can search company sites for an increase in job postings.

All of these tasks can be accomplished with creative use of Perl and its abundance of CPAN modules (CPAN, the Comprehensive Perl Archive Network, is the repository of Perl modules and libraries). This article covers some of the available CPAN modules and how they can be used to accomplish tasks similar to the ones above.

Why Perl? Why not? Perl is an excellent tool for a web mining project. Its basic but powerful built-in data structures, easily accessible regular expressions and large selection of CPAN modules mean that it easily meets the requirements of such an application.

The rest of this article will discuss some CPAN Modules that will be useful when building a Perl-based web crawler.

 
 