Now, while all this is well and good, you're probably wondering just how useful it is in practice. We've put together a short case study, based on our real-world experience, to put everything you've learned today in context (do note that, for purposes of illustration, we've over-simplified some aspects of the case study and omitted others entirely).
As a content production house, we have a substantial content catalog on a variety of subjects, much of it stored in ASCII text files for easy retrieval. Some time ago, we decided to use sections of this content on a customer's Web site, and so had to come up with a method of reading our data files, and generating HTML output from them. Since Perl is particularly suited to text processing, we decided to use it to turn our raw data into the output the customer needed.
Here's a sample raw data file, which contains a movie review and related information - let's call it "sample.dat":
Al Pacino and Russell Crowe
This is the body of the review.
It could consist of multiple paragraphs.
We're not including the entire review here...you've probably already seen
the movie. If you haven't, you should!
Within the data block, there are two main components - the header, which contains information about the title, the cast and director, and the length; and the review itself. We know that the first six lines of the file are restricted to header information, and anything following those six lines can be considered a part of the review.
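The header/body split described above can be sketched on its own. This is a minimal illustration, not the production script: the split_record() helper is a hypothetical name, and the lines it works on are placeholders rather than the real contents of "sample.dat".

```perl
use strict;
use warnings;

# Split a list of lines into a fixed-size header and the remaining body.
# split_record() is a hypothetical helper written for this sketch.
# It works on its own copy of the list and returns two array references.
sub split_record {
    my ($header_size, @lines) = @_;
    # splice() removes the first $header_size elements and returns them
    my @header = splice(@lines, 0, $header_size);
    # whatever is left in @lines is the body
    return (\@header, \@lines);
}

# placeholder data - six header lines followed by two body lines
my @lines = map { "header line $_" } 1 .. 6;
push @lines, "First paragraph of the review.", "Second paragraph.";

my ($header, $body) = split_record(6, @lines);
print scalar(@$header), " header lines, ", scalar(@$body), " body lines\n";
```

Because splice() both removes and returns the elements, a single call is enough to separate the two components.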
And here's the script we wrote to extract data from this file, HTML-ize it and display it:
# open the data file
open(DATA, "sample.dat") || die ("Unable to open file!\n");
# read it into an array
@data = <DATA>;
close(DATA);
# clean it up - remove the line breaks
foreach $line (@data)
{
chomp($line);
}
# extract the header - the first six lines
@header = splice(@data, 0, 6);
# join the remaining data with HTML line breaks
$review = join("<BR>", @data);
# print output
# note: the header indices below assume the order title, cast, director,
# length, rating - adjust them to match the layout of your data file
print "\"$header[0]\" stars $header[1] and was directed by $header[2].\n";
print "Here's what we thought:\n<P>\n";
print "$review\n<P>\n";
print "Length: $header[3] minutes\n<BR>\n";
print "Our rating: $header[4]\n<BR>\n";
And here's the output:
"The Insider" stars Al Pacino and Russell Crowe and was directed by Michael
Here's what we thought:
This is the body of the review.<BR><BR>It could consist of multiple
paragraphs.<BR><BR>We're not including the entire review here...you've
probably already seen the movie. If you haven't, you should!<BR>
Length: 178.00 minutes
Our rating: 5
As you can see, Perl makes it easy to extract data from a file, massage it into the format you want, and use it to generate another file.
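The script above prints its result to standard output. If you'd rather have Perl write the generated markup to a file itself, something along these lines should work. Treat it as a sketch under stated assumptions: write_file() is a hypothetical helper, the markup is a placeholder, and a temporary file stands in for whatever output name you would actually use.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a string to a file, replacing any existing contents.
# write_file() is a hypothetical helper written for this sketch.
sub write_file {
    my ($path, $content) = @_;
    open(my $out, '>', $path) or die "Unable to open $path for writing: $!\n";
    print {$out} $content;
    close($out) or die "Unable to close $path: $!\n";
}

# placeholder markup standing in for the generated review
my $html = join("<BR>", "First paragraph.", "Second paragraph.") . "\n";

# a temporary file keeps the sketch free of side effects; a real script
# would write to a fixed name of your choosing instead
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
close($tmp_fh);

write_file($tmp_path, $html);
my $size = -s $tmp_path;
print "Wrote $size bytes to $tmp_path\n";
```

The three-argument form of open() with an explicit '>' mode keeps the file name separate from the mode, which avoids surprises if a name happens to begin with a special character.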
This article copyright Melonfire 2000. All rights reserved.