They say there's more than one way to skin a cat - and that's twice as true when you're a Perl developer. In this concluding article on XML parsing with Perl, find out how the XML::DOM package provides an alternative technique for manipulating XML elements and attributes, and compare the two approaches to see which one works best for you.
I can do the same thing with the second example as well. However, since there are quite a few levels to the document tree, I've decided to use a recursive function to iterate through the tree, rather than a series of "if" statements.
Here's the XML file,
<?xml version="1.0"?>
<recipe>
<name>Chicken Tikka</name>
<author>Anonymous</author>
<date>1 June 1999</date>
<ingredients>
<item>
<desc>Boneless chicken breasts</desc>
<quantity>2</quantity>
</item>
<item>
<desc>Chopped onions</desc>
<quantity>2</quantity>
</item>
<item>
<desc>Ginger</desc>
<quantity>1 tsp</quantity>
</item>
<item>
<desc>Garlic</desc>
<quantity>1 tsp</quantity>
</item>
<item>
<desc>Red chili powder</desc>
<quantity>1 tsp</quantity>
</item>
<item>
<desc>Coriander seeds</desc>
<quantity>1 tsp</quantity>
</item>
<item>
<desc>Lime juice</desc>
<quantity>2 tbsp</quantity>
</item>
<item>
<desc>Butter</desc>
<quantity>1 tbsp</quantity>
</item>
</ingredients>
<servings>
3
</servings>
<process>
<step>Cut chicken into cubes, wash and apply lime juice and salt</step>
<step>Add ginger, garlic, chili, coriander and lime juice in a separate
bowl</step>
<step>Mix well, and add chicken to marinate for 3-4 hours</step>
<step>Place chicken pieces on skewers and barbeque</step>
<step>Remove, apply butter, and barbeque again until meat is tender</step>
<step>Garnish with lemon and chopped onions</step>
</process>
</recipe>
and here's the script which parses it.
#!/usr/bin/perl
# XML file
$file = "recipe.xml";

# hash of tag names mapped to HTML markup
# "recipe" => start a new block
# "name" => in bold
# "ingredients" => unordered list
# "desc" => list items
# "process" => ordered list
# "step" => list items
%startTags = (
    "name" => "<font size=+2>",
    "date" => "<i>(",
    "author" => "<b>",
    "servings" => "<i>Serves ",
    "ingredients" => "<h3>Ingredients:</h3><ul>",
    "desc" => "<li>",
    "quantity" => "(",
    "process" => "<h3>Preparation:</h3><ol>",
    "step" => "<li>"
);

# close tags opened above
%endTags = (
    "name" => "</font><br>",
    "date" => ")</i>",
    "author" => "</b>",
    "ingredients" => "</ul>",
    "quantity" => ")",
    "servings" => "</i>",
    "process" => "</ol>"
);

# this function accepts an array of nodes as argument,
# iterates through it and prints HTML markup for each tag it finds.
# for each node in the array, it then gets an array of the node's children, and
# calls itself again with the array as argument (recursion)
sub printData
{
    my (@nodeCollection) = @_;
    foreach my $node (@nodeCollection)
    {
        print $startTags{$node->getNodeName()};
        # assumes the first child of every element is a text node,
        # which holds for recipe.xml as laid out above
        print $node->getFirstChild()->getData();
        my @children = &getChildren($node);
        printData(@children);
        print $endTags{$node->getNodeName()};
    }
}

# this function accepts a node
# and returns all the element nodes under it (its children)
# as an array
sub getChildren
{
    my ($node) = @_;
    # get children of this node
    my @temp = $node->getChildNodes();
    my $count = 0;
    my @collection;
    # iterate through children
    foreach my $item (@temp)
    {
        # if this is an element
        # (need this to strip out text nodes containing whitespace)
        if ($item->getNodeType() == 1)
        {
            # add it to the @collection array
            $collection[$count] = $item;
            $count++;
        }
    }
    # return node collection
    return @collection;
}

use XML::DOM;

# instantiate parser
$xp = new XML::DOM::Parser();

# parse and create tree
$doc = $xp->parsefile($file);

# send standard header to browser
print "Content-Type: text/html\n\n";

# print HTML header
print "<html><head></head><body><hr>";

# get root node
$root = $doc->getDocumentElement();

# get children
@children = &getChildren($root);

# run a recursive function starting here
&printData(@children);

# print HTML footer (no table was opened, so none is closed here)
print "</body></html>";

# end
In this case, I've utilized a slightly different method to mark up the XML. I've first initialized a couple of hashes to map XML tags to corresponding HTML markup, in much the same manner as I did last time. Next, I've used DOM functions to obtain a reference to the first set of child nodes in the DOM tree.
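To make the "first set of child nodes" step concrete, here's a minimal sketch of the same filtering idea, written with grep and the ELEMENT_NODE constant that XML::DOM exports instead of the literal 1. The function name element_children() and the inline two-element document are my own, made up for illustration; they are not part of the script above.

```perl
use strict;
use warnings;
use XML::DOM;

# Hypothetical helper: return only the element children of a node,
# skipping the whitespace-only text nodes that sit between tags.
sub element_children {
    my ($node) = @_;
    return grep { $_->getNodeType() == ELEMENT_NODE } $node->getChildNodes();
}

# A tiny inline document, used here instead of recipe.xml
my $parser = XML::DOM::Parser->new();
my $doc    = $parser->parse(
    "<recipe>\n<name>Chicken Tikka</name>\n<author>Anonymous</author>\n</recipe>"
);
my $root = $doc->getDocumentElement();

my @all      = $root->getChildNodes();    # text nodes and elements together
my @elements = element_children($root);   # elements only
my @names    = map { $_->getNodeName() } @elements;

printf "%d child nodes, %d of them elements\n", scalar @all, scalar @elements;
print join(", ", @names), "\n";
```

The raw child list is longer than the element list because the line breaks between tags show up as text nodes of their own; that's exactly what the node type check strips out.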
This initial array of child nodes is used to "seed" my printData() function, a recursive function which takes an array of child nodes, matches their tag names to values in the associative arrays, and outputs the corresponding HTML markup to the browser. It also obtains a reference to the next set of child nodes, via the getChildren() function, and calls itself with the new node collection as argument.
By using this recursive function, I've managed to substantially reduce the number of "if" conditional statements in my script; the code is now easier to read, and also structured more logically.
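If you'd like to experiment with the recursion on its own, here's a rough sketch of one variation: the function visits every child node, text nodes included, and returns the markup as a string rather than printing it. This is not the script above, just one possible alternative; the render() function, the trimmed-down %start and %end hashes and the inline sample document are all assumptions made for this example.

```perl
use strict;
use warnings;
use XML::DOM;

# Hypothetical, cut-down tag maps for this sketch only
my %start = (name => "<b>",  ingredients => "<ul>",  desc => "<li>");
my %end   = (name => "</b>", ingredients => "</ul>", desc => "</li>");

# Walk the tree recursively and return the markup as a string
sub render {
    my ($node) = @_;
    my $type = $node->getNodeType();

    # text nodes contribute their data, unless they are whitespace only
    if ($type == TEXT_NODE) {
        my $text = $node->getData();
        return $text =~ /\S/ ? $text : "";
    }

    # ignore anything that is neither text nor an element
    return "" unless $type == ELEMENT_NODE;

    # opening markup, then every child in turn (recursion), then closing markup
    my $name = $node->getNodeName();
    my $html = $start{$name} // "";
    $html .= render($_) for $node->getChildNodes();
    $html .= $end{$name} // "";
    return $html;
}

my $xml = <<'XML';
<recipe>
<name>Chicken Tikka</name>
<ingredients>
<item><desc>Ginger</desc></item>
<item><desc>Garlic</desc></item>
</ingredients>
</recipe>
XML

my $parser = XML::DOM::Parser->new();
my $doc    = $parser->parse($xml);
my $html   = render($doc->getDocumentElement());
print "$html\n";
```

Tags with no entry in the hashes (recipe and item in this sample) simply add no markup of their own, so supporting a new tag is a matter of adding a hash entry, not another "if" branch.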