There are a number of ways you can retrieve information from the web. You can access it directly via a browser, or you can write a script that gets the information for you and delivers it in a form you can use. The LWP library for Perl can help you with the latter. Keep reading for a closer look.
Now that we have a request object, we need to actually send the request to the server. This is done using the LWP::UserAgent module. The user agent puts everything together and makes it all work. It's the thing that actually communicates with the target Web server.
To make the request, we first need to create an LWP::UserAgent object. Then we call the user agent's request method, passing the request object as an argument. The request method returns a response, which we'll get to shortly. Let's return to the Google example and actually retrieve the index page. Here's how this is done:
use HTTP::Request;
use LWP::UserAgent;

# Make the request object
my $request = new HTTP::Request(GET => 'http://google.com');

# Create the user agent and make the actual request
my $ua = new LWP::UserAgent;
my $response = $ua->request($request);
Notice how the request is the same as before. All we've done is load LWP::UserAgent and add two lines that work with the user agent.
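Before sending anything, it is often worth adjusting a couple of settings on the user agent. The following is only a minimal sketch, not part of the script above: the 10-second timeout and the agent string MyFetcher/0.1 are assumed values chosen for illustration. It uses the LWP::UserAgent->new form, which does the same thing as new LWP::UserAgent.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Create the user agent (same effect as "new LWP::UserAgent")
my $ua = LWP::UserAgent->new;

# Abort a request if the connection shows no activity for 10 seconds
# (10 is an assumed value, not taken from the article)
$ua->timeout(10);

# Identify the script to the server; "MyFetcher/0.1" is a made-up name
$ua->agent('MyFetcher/0.1');

# Calling the same methods without arguments reads the settings back
print "Timeout: ", $ua->timeout, " seconds\n";
print "Agent:   ", $ua->agent, "\n";
```

Both settings apply to every request made through this user agent, so they only need to be set once.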
Now that we have the response, how do we actually extract the content of the page? Before we do this, we'll want to make sure that the request succeeded by calling the response's is_success method. If it did, the content method returns the body of the response. Let's add to our script, making it print out the content:
# Print out the content
print $response->content . "\n" if ($response->is_success);
When you run the script, you should see the source of the Google index page printed out to the screen.
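If the request fails, the script prints nothing at all, which can make problems hard to spot. One possible way to report the failure is the response's status_line method, which returns the status code and message as a single string. In the sketch below, the helper name describe_response is hypothetical, and the two responses are built by hand with HTTP::Response->new so that the example runs without a network connection.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the page content on success,
# or a short error message built from the status line on failure.
sub describe_response {
    my ($response) = @_;
    return $response->content if $response->is_success;
    return 'Request failed: ' . $response->status_line;
}

# Hand-built responses standing in for what $ua->request would return
my $ok      = HTTP::Response->new(200, 'OK', undef, '<html>Hello</html>');
my $missing = HTTP::Response->new(404, 'Not Found');

print describe_response($ok), "\n";       # <html>Hello</html>
print describe_response($missing), "\n";  # Request failed: 404 Not Found
```

In the real script, the same helper could be called on the value returned by $ua->request($request).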
The script certainly works, but it's a real pain to create a request object only to use it one time. Fortunately, LWP::UserAgent provides some shortcuts. Instead of creating an HTTP::Request on our own, we can have LWP::UserAgent create one for us. The module provides two methods, get and post, that do much of the work for us.
These methods can take a number of arguments, but since we don't have space to cover all of them in this article, we'll just take a look at the basic functionality provided by them. The first argument of both methods, and the only required one, is the URL. This is the only argument we need to pass in order to rewrite the Google index page script. Let's do that now:
use LWP::UserAgent;

my $ua = new LWP::UserAgent;
my $response = $ua->get('http://google.com');
print $response->content . "\n" if ($response->is_success);
As you can see, this script is much shorter than the last one, and it's easier to read. Notice how the get method returns a response object, just as the request method does. The post method operates the same way.
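As a rough illustration of post, form fields can be passed after the URL as an array or hash reference. The sketch below uses a placeholder address (http://example.com/search) and a made-up field named q, neither of which comes from this article. To stay runnable offline, it builds the request with the POST function from HTTP::Request::Common and prints its parts instead of sending it; the commented line at the end shows the corresponding user agent call.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Build (but do not send) a form POST; the URL and field are placeholders
my $request = POST 'http://example.com/search', [ q => 'perl' ];

print $request->method, "\n";                  # POST
print $request->header('Content-Type'), "\n";  # application/x-www-form-urlencoded
print $request->content, "\n";                 # q=perl

# With a user agent, the same form could be submitted in one call:
# my $response = $ua->post('http://example.com/search', [ q => 'perl' ]);
```

Printing the request this way is a handy check on what a form submission will contain before it goes to a real server.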