Web Access with LWP

There are a number of ways you can retrieve information from the web. You can access it directly via a browser, or you can write a script that gets the information for you and delivers it in a form you can use. The LWP library for Perl can help you with the latter. Keep reading for a closer look.

TABLE OF CONTENTS:
  1. Web Access with LWP
  2. Making Requests
  3. Making it Work
  4. Getting the Weather
By: Peyton McCullough
May 14, 2009


The Web is a wonderful resource. It contains a wealth of information about nearly every conceivable topic, and to access much of that information, you only need a Web browser, of which there are several to choose from. For example, if I need to check the weather before I head outside to figure out what I should wear, I can simply navigate to the appropriate page, enter my city, and I'll be presented with the current weather conditions. Or, if I want to find information about a given movie, I need only Google it, or look it up on IMDb.

This is fine when I'm going to consume the information directly and in its native format. However, what if I want to write a program that accesses this information? Say, for example, that I wanted to record the information, or transform it in some way. 

This is a common task, and accessing information on the Web is actually fairly easy. In fact, there are a number of libraries that can do the job. In this article, we'll be taking a brief look at LWP, a Perl library designed for Web access. 

The LWP library actually has a number of interesting and complex features, but in this article, we're only going to look at the basics—enough to design a simple Perl script that can request and receive data from the Web.

Starting out Simple 

As it turns out, LWP provides easy access to the most basic functionality. If all you want to do is retrieve the raw content of a document, LWP makes this straightforward by providing access to a few basic functions contained in LWP::Simple. 

The simplest thing to do is probably to just get the content of a document. This can be done using the appropriately-named get function. This function takes one argument: the URL of the document to be requested. It returns the content of the document, or undef if the request fails.

So, if we wanted to get the content of Google's index page, we would only need to make one function call and then print the result to the screen. Let's go ahead and create a short script that does just that:

 

#!/usr/bin/perl

use strict;
use LWP::Simple;

print get('http://google.com');

 

The above script is pretty straightforward. As you can see, there's not much to it. 
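Because get returns undef when a request fails, it can be worth checking the result before printing it. The following is only a minimal sketch of that check: the helper name fetch_or_die is made up for illustration and is not part of LWP::Simple, and the URL is taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

# fetch_or_die is a hypothetical helper name, not part of LWP::Simple.
sub fetch_or_die {
    my ($url) = @_;
    my $content = get($url);    # undef when the request fails
    die "Could not fetch $url\n" unless defined $content;
    return $content;
}

# Run as: perl fetch.pl http://google.com
print fetch_or_die($ARGV[0]) if @ARGV;
```

Dying on failure keeps the sketch short; a longer script might prefer to log the problem and carry on.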

Printing the content of a page isn't an uncommon task, though, and LWP::Simple actually provides a function that both fetches a document's content and prints it to STDOUT. The function is called getprint and accepts one argument, which is the URL of the document to get, just like the get function. So, we could change the previous script's last line to this, and the result would be the same:

 

getprint('http://google.com');

 

If we want to store the document's contents in a file, we could redirect STDOUT to that file and then call getprint. However, LWP::Simple also provides a function called getstore, which stores the content of a URL in a given file. The first argument is the URL, and the second is the name of the file. To store the Google index page in a file called google.html, we'd make the following call:

 

getstore('http://google.com', 'google.html');

 

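Sometimes, though, it only makes sense to store a document if it's been updated. We can do this with the mirror function, which takes the same arguments as the getstore function. It sends an If-Modified-Since header based on the local file's modification time, so the document is only downloaded again if the remote copy is newer: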

 

mirror('http://google.com', 'google.html');
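Like getstore, mirror returns the HTTP status code, and a 304 (RC_NOT_MODIFIED) means the local copy was already current. As a rough sketch, the result could be interpreted like this. The helper describe_mirror is a made-up name for illustration, and the URL and file name are assumed to come from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

# describe_mirror is a hypothetical helper; it only interprets a status code.
sub describe_mirror {
    my ($code) = @_;
    return 'unchanged' if $code == RC_NOT_MODIFIED;    # 304: local copy is current
    return 'updated'   if is_success($code);           # 2xx: a new copy was stored
    return "failed ($code)";
}

# Run as: perl mirror.pl http://google.com google.html
if (@ARGV == 2) {
    my ($url, $file) = @ARGV;
    print describe_mirror(mirror($url, $file)), "\n";
}
```

RC_NOT_MODIFIED and is_success are exported by LWP::Simple, so no extra modules are needed.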

 

The getprint, getstore and mirror functions do one additional thing: they return the HTTP response code, which in some cases is very useful. This code can be compared against the status constants that LWP::Simple exports, such as RC_OK. For example, below we check to see if everything went well:

 

my $response_code = getprint('http://google.com');
print "\nOK\n" if ($response_code == RC_OK);
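Comparing against RC_OK only matches a 200 response. LWP::Simple also exports the is_success and is_error functions, which classify a whole range of status codes. Here is a small sketch that applies them to getstore; the wrapper name save_page is hypothetical, and the arguments are assumed to come from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

# save_page is a hypothetical wrapper around getstore.
# It returns true when the download succeeded and warns otherwise.
sub save_page {
    my ($url, $file) = @_;
    my $code = getstore($url, $file);
    warn "Download of $url failed with status $code\n" if is_error($code);
    return is_success($code);
}

# Run as: perl save.pl http://google.com google.html
if (@ARGV == 2) {
    save_page(@ARGV) or exit 1;
}
```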

 

As you can see, the LWP::Simple module makes common tasks very easy, as its name suggests. 



 
 