Using Perl with XML (part 1)
By: icarus, (c) Melonfire
    2002-01-15


    Table of Contents:
  • Using Perl with XML (part 1)
  • Getting Down To Business
  • Let's Talk About SAX
  • Breaking It Down
  • Call Me Back
  • Random Walk
  • What's For Dinner?

    Using Perl with XML (part 1) - Call Me Back
    ( Page 5 of 7 )

    As I've just explained, the start(), end() and cdata() functions will be called by the parser as it progresses through the document. We haven't defined these yet - let's do that next:

        # keep track of which tag is currently being processed
        my $currentTag = "";

        # this is called when a start tag is found
        sub start {
            # extract variables
            my ($parser, $name, %attr) = @_;
            $currentTag = lc($name);
            if ($currentTag eq "book") {
                print "<tr>";
            } elsif ($currentTag eq "title") {
                print "<td>";
            } elsif ($currentTag eq "author") {
                print "<td>";
            } elsif ($currentTag eq "price") {
                print "<td>";
            } elsif ($currentTag eq "rating") {
                print "<td>";
            }
        }

    Each time the parser encounters a starting tag, it calls start() with the parser object, the name of the tag and the tag's attributes (if any) as arguments. The start() function then processes the tag, printing corresponding HTML markup in place of the XML tag.

    I've used an "if" statement, keyed on the tag name, to decide how to process each tag. For example, since I know that <book> indicates the beginning of a new row in my desired output, I replace it with a <tr>, while other elements like <title> and <author> correspond to table cells, and are replaced with <td> tags.
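
As the list of tags grows, the "if" chain grows with it. One alternative is a lookup table keyed on the tag name. What follows is only a minimal sketch of that idea; the names %open_tag and start_markup() are hypothetical and are not part of the script in this article.

```perl
use strict;
use warnings;

# Hypothetical lookup table: XML element name => opening HTML markup.
my %open_tag = (
    book   => "<tr>",
    title  => "<td>",
    author => "<td>",
    price  => "<td>",
    rating => "<td>",
);

# Return the HTML that replaces an XML start tag, or "" for tags
# that have no HTML counterpart (such as the root element).
sub start_markup {
    my ($name) = @_;
    return $open_tag{ lc $name } // "";
}

print start_markup("book"), start_markup("title"), "\n";   # prints: <tr><td>
```

Inside start(), the whole chain could then shrink to a single print start_markup($name) call, and supporting a new element would mean adding one line to the table.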

    In case you're wondering, I've used the lc() function to convert the tag name to lowercase before performing the comparison. XML element names are case-sensitive, so this isn't strictly required, but it keeps the comparisons consistent and lets the script work with XML documents that use upper-case or mixed-case tags.

    Finally, I've also stored the current tag name in the global variable $currentTag - this can be used to identify which tag is being processed at any stage, and it'll come in useful a little further down.

    The end() function takes care of closing tags, and looks similar to start() - note that I've explicitly cleared $currentTag at the end, so that any text following a closing tag isn't mistaken for that element's content.

        # this is called when an end tag is found
        sub end {
            my ($parser, $name) = @_;
            $currentTag = lc($name);
            if ($currentTag eq "book") {
                print "</tr>";
            } elsif ($currentTag eq "title") {
                print "</td>";
            } elsif ($currentTag eq "author") {
                print "</td>";
            } elsif ($currentTag eq "price") {
                print "</td>";
            } elsif ($currentTag eq "rating") {
                print "</td>";
            }
            # clear value of current tag
            $currentTag = "";
        }

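
One limitation of a single $currentTag variable is that once end() clears it, the script no longer knows which element is still open around the one that just closed. That doesn't matter for a flat document like the library, but where text can follow a nested element, a stack of open element names is a common alternative. Here's a minimal sketch, assuming XML::Parser is installed; the handler names, the @open and @seen arrays, and the sample XML string are all hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

my @open;   # names of the elements that are currently open
my @seen;   # "element=text" records, collected for inspection

sub on_start { my ($p, $name) = @_; push @open, lc $name; }
sub on_end   { pop @open; }
sub on_char {
    my ($p, $data) = @_;
    return unless $data =~ /\S/;       # skip whitespace-only runs
    push @seen, "$open[-1]=$data";     # attribute text to the innermost open element
}

my $xp = XML::Parser->new(
    Handlers => { Start => \&on_start, End => \&on_end, Char => \&on_char },
);
$xp->parse('<book><title>XML</title>on loan</book>');

print join(", ", @seen), "\n";   # prints: title=XML, book=on loan
```

The text after the closing title tag is attributed to book, which a cleared $currentTag could not tell you.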
    Note that empty elements generate both start and end events.
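
A quick way to see this is to log the events for a document containing an empty element. This is just an illustrative sketch, assuming XML::Parser is installed; the @events array and the sample XML string are hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;   # event log, in the order the parser fires the handlers

my $xp = XML::Parser->new(
    Handlers => {
        Start => sub { my ($p, $name) = @_; push @events, "start:$name" },
        End   => sub { my ($p, $name) = @_; push @events, "end:$name" },
    },
);
$xp->parse('<book><title/></book>');

print join(" ", @events), "\n";
# prints: start:book start:title end:title end:book
```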

    So this takes care of replacing XML tags with corresponding HTML tags...but what about handling the data between them?

        # this is called when character data is found
        sub cdata {
            my ($parser, $data) = @_;
            my @ratings = ("Words fail me!", "Terrible", "Bad",
                           "Indifferent", "Good", "Excellent");
            if ($currentTag eq "title") {
                print "<i>$data</i>";
            } elsif ($currentTag eq "author") {
                print $data;
            } elsif ($currentTag eq "price") {
                print "\$$data";
            } elsif ($currentTag eq "rating") {
                print $ratings[$data];
            }
        }

    The cdata() function is called whenever the parser encounters character data between an XML tag pair. Note, however, that the function is passed only the parser object and the data as arguments; the name of the enclosing tag is not among them. Since the parser works through the XML document sequentially, though, we can use the $currentTag variable to identify which tag this data belongs to.
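
One thing to watch for: the XML::Parser documentation notes that a single run of character data may be delivered through several calls to the Char handler (text around an entity reference is a typical case). A handler that prints wrappers on every call, like the italics above, could then wrap each piece separately. A common workaround is to collect the pieces in a buffer and use them when the end tag arrives. The sketch below assumes XML::Parser is installed; $buffer, @titles and the sample XML string are hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

my $buffer = "";   # text collected since the last start tag
my @titles;        # completed <title> values

my $xp = XML::Parser->new(
    Handlers => {
        # a new element begins: start with an empty buffer
        Start => sub { $buffer = "" },
        # append each piece, however many calls the parser makes
        Char  => sub { my ($p, $data) = @_; $buffer .= $data },
        # the element is complete: use the whole text at once
        End   => sub {
            my ($p, $name) = @_;
            push @titles, $buffer if lc $name eq "title";
            $buffer = "";
        },
    },
);
$xp->parse('<book><title>Cats &amp; Dogs</title></book>');

print "$titles[0]\n";   # prints: Cats & Dogs
```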

    Depending on the value of $currentTag, an "if" statement is used to print data with appropriate formatting; this is the place where I add italics to the title, a currency symbol to the price, and a text rating (corresponding to a numerical index) from the @ratings array.
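
The rating lookup trusts the document to contain a number between 0 and 5. If that can't be guaranteed, a small guard avoids printing nothing (or triggering a warning) for an unexpected value. This is a sketch only; rating_text() and the "Unrated" fallback label are hypothetical and not part of the original script.

```perl
use strict;
use warnings;

my @ratings = ("Words fail me!", "Terrible", "Bad",
               "Indifferent", "Good", "Excellent");

# Hypothetical helper: map a numeric rating to its label, falling back
# to a placeholder when the value is not a valid index into @ratings.
sub rating_text {
    my ($value) = @_;
    return "Unrated"
        unless defined $value && $value =~ /\A\d+\z/ && $value <= $#ratings;
    return $ratings[$value];
}

print rating_text("4"), "\n";    # prints: Good
print rating_text("11"), "\n";   # prints: Unrated
```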

    Here's what the finished script (with some additional HTML, so that you can use it via CGI) looks like:

        #!/usr/bin/perl

        use strict;
        use warnings;

        # include package
        use XML::Parser;

        # initialize parser
        my $xp = XML::Parser->new();

        # set callback functions
        $xp->setHandlers(Start => \&start, End => \&end, Char => \&cdata);

        # keep track of which tag is currently being processed
        my $currentTag = "";

        # send standard header to browser
        print "Content-Type: text/html\n\n";

        # set up HTML page
        print "<html><head></head><body>";
        print "<h2>The Library</h2>";
        print "<table border=1 cellspacing=1 cellpadding=5>";
        print "<tr><td align=center>Title</td><td align=center>Author</td>"
            . "<td align=center>Price</td><td align=center>User Rating</td></tr>";

        # parse XML
        $xp->parsefile("library.xml");

        print "</table></body></html>";

        # this is called when a start tag is found
        sub start {
            # extract variables
            my ($parser, $name, %attr) = @_;
            $currentTag = lc($name);
            if ($currentTag eq "book") {
                print "<tr>";
            } elsif ($currentTag eq "title") {
                print "<td>";
            } elsif ($currentTag eq "author") {
                print "<td>";
            } elsif ($currentTag eq "price") {
                print "<td>";
            } elsif ($currentTag eq "rating") {
                print "<td>";
            }
        }

        # this is called when character data is found
        sub cdata {
            my ($parser, $data) = @_;
            my @ratings = ("Words fail me!", "Terrible", "Bad",
                           "Indifferent", "Good", "Excellent");
            if ($currentTag eq "title") {
                print "<i>$data</i>";
            } elsif ($currentTag eq "author") {
                print $data;
            } elsif ($currentTag eq "price") {
                print "\$$data";
            } elsif ($currentTag eq "rating") {
                print $ratings[$data];
            }
        }

        # this is called when an end tag is found
        sub end {
            my ($parser, $name) = @_;
            $currentTag = lc($name);
            if ($currentTag eq "book") {
                print "</tr>";
            } elsif ($currentTag eq "title") {
                print "</td>";
            } elsif ($currentTag eq "author") {
                print "</td>";
            } elsif ($currentTag eq "price") {
                print "</td>";
            } elsif ($currentTag eq "rating") {
                print "</td>";
            }
            # clear value of current tag
            $currentTag = "";
        }

        # end

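    And when you run it, you'll see an HTML page headed "The Library", containing a table with one row per book and columns for Title, Author, Price and User Rating.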



    You can now add new items to your XML document, or edit existing items, and your rendered HTML page will change accordingly. By separating the data from the presentation, XML has imposed standards on data collections, making it possible, for example, for users with no technical knowledge of HTML to easily update content on a Web site, or to present data from a single source in different ways.
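
One caveat when adding new items: the parser hands character data to the handlers with entity references already decoded, so a title written as Q&amp;A in the XML arrives as Q&A, and printing it unchanged puts a bare ampersand into the HTML. Escaping the data before printing avoids that. Here's a minimal sketch; escape_html() is a hypothetical helper, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, so the entities added below survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

print escape_html('Q&A: <XML> "basics"'), "\n";
# prints: Q&amp;A: &lt;XML&gt; &quot;basics&quot;
```

In cdata(), this would mean printing escape_html($data) wherever $data is currently printed.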

     
     
    © 2003-2009 by Developer Shed. All rights reserved.