LWP, short for libwww-perl, is a common library that is included with most installations of Perl. As the LWP perldoc describes it, LWP is a collection of Perl modules that provide a simple and consistent application programming interface to the World Wide Web. LWP supports redirection, cookies, basic authentication, and robots.txt parsing. For the majority of web-crawling requirements a developer can use LWP::Simple. LWP::Simple allows the developer to store the head or body of a web page (given its URL) in a scalar variable or a file. Here is an example.
use LWP::Simple;

# Store the output of the web page (HTML and all) in $content.
my $content = get("http://www.yahoo.com");
if (defined $content) {
    # $content contains the HTML associated with the URL above.
    print $content;
} else {
    # If an error occurs, $content is not defined.
    print "Error: Get failed\n";
}
After loading the LWP::Simple module with the use command, the get subroutine is called to download the HTML from the http://www.yahoo.com web site. The HTML is stored in the $content variable. If no error occurs, the value of $content is printed to standard output; otherwise an error message is printed.
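LWP::Simple can also save a page straight to a file or fetch only its headers. The following is a minimal sketch of that, not a definitive recipe. The helper name save_page and the output file name page.html are hypothetical and chosen only for illustration; getstore, head, and is_success are exported by LWP::Simple by default.

```perl
use strict;
use warnings;
use LWP::Simple;   # exports get, head, getstore and is_success by default

# Hypothetical helper: save the body of $url to $file.
# Returns true when the HTTP status code indicates success.
sub save_page {
    my ($url, $file) = @_;
    my $status = getstore($url, $file);   # getstore returns the HTTP status code
    return is_success($status);
}

my $url = "http://www.yahoo.com";
if (save_page($url, "page.html")) {
    print "Saved $url to page.html\n";
} else {
    print "Error: getstore failed\n";
}

# In list context head returns the content type, document length,
# modification time, expiry time and server name.
# On failure it returns an empty list, so $content_type stays undefined.
my ($content_type, $length) = head($url);
print "Content type: $content_type\n" if defined $content_type;
```

Wrapping getstore in a small subroutine keeps the status check in one place, which may be convenient when a crawler saves many pages.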
Other modules in the LWP bundle handle cookies, automatic redirection, and other tasks. For more information, read the perldoc for LWP::RobotUA and LWP::UserAgent.
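As a rough illustration of what LWP::UserAgent offers beyond LWP::Simple, the sketch below configures a user agent with an in-memory cookie jar and a redirect limit. The agent string ExampleCrawler/0.1 and the timeout and redirect values are assumptions made for this example, not recommendations from the LWP documentation.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# All option values here are illustrative assumptions.
my $ua = LWP::UserAgent->new(
    agent        => "ExampleCrawler/0.1",   # hypothetical agent string
    timeout      => 10,                     # seconds to wait for a reply
    max_redirect => 5,                      # follow at most 5 redirects
    cookie_jar   => HTTP::Cookies->new,     # in-memory cookie store
);

# get returns an HTTP::Response object even when the request fails.
my $response = $ua->get("http://www.yahoo.com");
if ($response->is_success) {
    print $response->content;               # raw body of the page
} else {
    print "Error: ", $response->status_line, "\n";
}
```

Unlike the get subroutine in LWP::Simple, the response object exposes the status line, so the error message can say why the request failed. LWP::RobotUA is a subclass of LWP::UserAgent, so a crawler that must honor robots.txt can be built along similar lines.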