Using Perl with XML (part 1) - Let's Talk About SAX
The first of these approaches is SAX, the Simple API for XML. A SAX parser works by reading through an XML document from start to finish and calling specific functions as it encounters different parts of the markup. For example, I might register one function to be called for each starting tag, another for each ending tag, and a third for the character data between them.
The parser's responsibility is simply to parse the document; the functions it calls are responsible for processing whatever it finds. Once one piece of the document has been handled, the parser moves on to the next, and the process repeats until the end of the document.
Perl has a SAX-style parser based on the expat library created by James Clark; it's implemented as a Perl package named XML::Parser, and currently maintained by Clark Cooper. It is not part of the core Perl distribution, although some Perl distributions bundle it. If you don't already have it, you should download and install it before proceeding further; you can get a copy from CPAN (http://www.cpan.org/).
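To make the event model a little more concrete, here's a minimal sketch, assuming XML::Parser is installed. The handler names (start_tag, end_tag, char_data) and the @events array are my own, picked purely for illustration; the only things XML::Parser itself dictates are the Start, End and Char keys of the Handlers option. Each handler does nothing more than record what the parser reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Log of everything the parser reports (illustrative name, not part of the API)
my @events;

# Called for every opening tag; receives the parser object,
# the element name and any attribute name/value pairs
sub start_tag {
    my ($expat, $name, %attrs) = @_;
    push @events, "start:$name";
}

# Called for every closing tag
sub end_tag {
    my ($expat, $name) = @_;
    push @events, "end:$name";
}

# Called for character data; whitespace-only chunks are skipped
sub char_data {
    my ($expat, $text) = @_;
    return unless $text =~ /\S/;
    push @events, "text:$text";
}

# Tell the parser which function to call for which kind of event
my $parser = XML::Parser->new(
    Handlers => {
        Start => \&start_tag,
        End   => \&end_tag,
        Char  => \&char_data,
    },
);

# parse() accepts a string (or an open filehandle)
$parser->parse('<book><title>Dreamcatcher</title></book>');

print "$_\n" for @events;
```

Note that the parser may deliver a single run of text in more than one call to the Char handler, so real code usually accumulates character data rather than assuming it arrives in one piece.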
I'll begin by putting together a simple XML file:
<?xml version="1.0"?>
<library>
  <book>
    <title>Dreamcatcher</title>
    <author>Stephen King</author>
    <genre>Horror</genre>
    <pages>899</pages>
    <price>23.99</price>
    <rating>5</rating>
  </book>
  <book>
    <title>Mystic River</title>
    <author>Dennis Lehane</author>
    <genre>Thriller</genre>
    <pages>390</pages>
    <price>17.49</price>
    <rating>4</rating>
  </book>
  <book>
    <title>The Lord Of The Rings</title>
    <author>J. R. R. Tolkien</author>
    <genre>Fantasy</genre>
    <pages>3489</pages>
    <price>10.99</price>
    <rating>5</rating>
  </book>
</library>
Once my data is in well-formed XML, I need to decide what I'd like the final output to look like.
Let's say I want a simple table containing columns for the book title, author, price and rating. (I'm not using all the information in the XML file.) The title of the book is printed in italics, while the numerical rating is converted into something more readable.
Next, I'll write some Perl code to take care of this for me.