Using Perl with XML (part 1)
By: icarus, (c) Melonfire
    2002-01-15

    Table of Contents:
  • Using Perl with XML (part 1)
  • Getting Down To Business
  • Let's Talk About SAX
  • Breaking It Down
  • Call Me Back
  • Random Walk
  • What's For Dinner?

    Using Perl with XML (part 1) - Call Me Back
    (Page 5 of 7 )

    As I've just explained, the start(), end() and cdata() functions will be called by the parser as it progresses through the document. We haven't defined these yet - let's do that next:

        # keep track of which tag is currently being processed
        my $currentTag = "";

        # this is called when a start tag is found
        sub start {
            # extract variables
            my ($parser, $name, %attr) = @_;
            $currentTag = lc($name);

            if    ($currentTag eq "book")   { print "<tr>"; }
            elsif ($currentTag eq "title")  { print "<td>"; }
            elsif ($currentTag eq "author") { print "<td>"; }
            elsif ($currentTag eq "price")  { print "<td>"; }
            elsif ($currentTag eq "rating") { print "<td>"; }
        }

    Each time the parser encounters a starting tag, it calls start() with the name of the tag (and attributes, if any) as arguments. The start() function then processes the tag, printing corresponding HTML markup in place of the XML tag.

    I've used an "if" statement, keyed on the tag name, to decide how to process each tag. For example, since I know that <book> indicates the beginning of a new row in my desired output, I replace it with a <tr>, while other elements like <title> and <author> correspond to table cells, and are replaced with <td> tags.
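
    As an alternative to the "if" chain, the same tag-to-markup mapping can be kept in a hash. The following is only a sketch: the %open_html hash and the open_markup() helper are hypothetical names that do not appear in the original script, and the handler is driven by hand here rather than by a real parser.

```perl
use strict;
use warnings;

# Lower-cased element name => HTML that opens it (same mapping as the "if" chain).
my %open_html = (
    book   => '<tr>',
    title  => '<td>',
    author => '<td>',
    price  => '<td>',
    rating => '<td>',
);

my $currentTag = '';

# Hypothetical helper: look up the markup for a tag, or '' if it is not mapped.
sub open_markup {
    my ($name) = @_;
    return $open_html{ lc $name } // '';
}

# Start handler, taking the argument list that XML::Parser passes to it.
sub start {
    my ($parser, $name, %attr) = @_;
    $currentTag = lc $name;
    print open_markup($name);
}

# Simulated parser events (no real parser is involved in this sketch).
start(undef, 'BOOK');
start(undef, 'Title');
print "\n";    # output so far: <tr><td>
```

    Adding support for a new element then means adding one line to the hash instead of another "elsif" branch.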

    In case you're wondering, I've used the lc() function to convert the tag name to lowercase before performing the comparison. XML itself is case-sensitive, so <Book> and <book> are technically different elements, but normalizing the name here keeps the comparisons consistent and ensures that the script also works with XML documents that use upper-case or mixed-case tags.

    Finally, I've also stored the current tag name in the global variable $currentTag - this can be used to identify which tag is being processed at any stage, and it'll come in useful a little further down.

    The end() function takes care of closing tags, and looks similar to start() - note that I've specifically cleared $currentTag at the end, so that any text or whitespace following the closing tag isn't mistaken for that tag's content.

        # this is called when an end tag is found
        sub end {
            my ($parser, $name) = @_;
            $currentTag = lc($name);

            if    ($currentTag eq "book")   { print "</tr>"; }
            elsif ($currentTag eq "title")  { print "</td>"; }
            elsif ($currentTag eq "author") { print "</td>"; }
            elsif ($currentTag eq "price")  { print "</td>"; }
            elsif ($currentTag eq "rating") { print "</td>"; }

            # clear value of current tag
            $currentTag = "";
        }

    Note that empty elements generate both start and end events.
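
    To see the order in which these events fire, here is a small sketch that records each start and end event instead of printing HTML. The inline XML string is made-up sample data (it is not the article's library.xml), and @events is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;    # event log, in the order the parser reports them

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { my ($p, $name) = @_; push @events, "start:$name"; },
        End   => sub { my ($p, $name) = @_; push @events, "end:$name"; },
    },
);

# <rating/> is an empty element; it still produces a start and an end event.
$parser->parse('<book><title>Sample Title</title><rating/></book>');

print join(", ", @events), "\n";
# start:book, start:title, end:title, start:rating, end:rating, end:book
```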

    So this takes care of replacing XML tags with corresponding HTML tags...but what about handling the data between them?

        # this is called when character data is found
        sub cdata {
            my ($parser, $data) = @_;
            my @ratings = ("Words fail me!", "Terrible", "Bad", "Indifferent", "Good", "Excellent");

            if    ($currentTag eq "title")  { print "<i>$data</i>"; }
            elsif ($currentTag eq "author") { print $data; }
            elsif ($currentTag eq "price")  { print "\$$data"; }
            elsif ($currentTag eq "rating") { print $ratings[$data]; }
        }

    The cdata() function is called whenever the parser encounters character data between tags; despite the function's name, this means ordinary text content, not only CDATA sections. Note, however, that the function is only passed the data as an argument; there is no way of telling which tags surround it. However, since the parser works through the document in order, firing each callback as it goes, we can use the $currentTag variable to identify which tag this data belongs to.
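
    One caveat worth knowing: XML::Parser may deliver a single run of text through more than one call to the character-data handler. A common way to cope is to collect the pieces in a buffer and only use the text once the end tag arrives. The sketch below assumes that approach; the names on_start(), on_char(), on_end(), @open_tags, $buffer and @cells are all hypothetical, and the events are fired by hand to imitate a parser splitting a title in two.

```perl
use strict;
use warnings;

my @open_tags;      # stack of element names that are currently open
my $buffer = '';    # text collected since the most recent tag
my @cells;          # finished "tag=text" pairs, in document order

sub on_start {
    my ($parser, $name, %attr) = @_;
    push @open_tags, lc $name;
    $buffer = '';
}

sub on_char {
    my ($parser, $data) = @_;
    $buffer .= $data;    # just accumulate; this may run several times per element
}

sub on_end {
    my ($parser, $name) = @_;
    my $tag = pop @open_tags;
    (my $text = $buffer) =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    push @cells, "$tag=$text" if length $text;
    $buffer = '';
}

# Simulated events: the title text arrives in two separate chunks.
on_start(undef, 'book');
on_char(undef, "\n  ");
on_start(undef, 'title');
on_char(undef, 'Sample ');
on_char(undef, 'Title');
on_end(undef, 'title');
on_char(undef, "\n");
on_end(undef, 'book');

print join('; ', @cells), "\n";    # title=Sample Title
```

    Because the text is only used in on_end(), the whitespace between elements is discarded and a title split across calls is still treated as one value.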

    Depending on the value of $currentTag, an "if" statement is used to print data with appropriate formatting; this is the place where I add italics to the title, a currency symbol to the price, and a text rating (corresponding to a numerical index) from the @ratings array.
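
    The rating lookup trusts that the XML always holds a number between 0 and 5. A more defensive version might look like the sketch below, where rating_text() is a hypothetical helper and "Unrated" is an assumed fallback label, neither of which is part of the original script.

```perl
use strict;
use warnings;

my @ratings = ("Words fail me!", "Terrible", "Bad", "Indifferent", "Good", "Excellent");

# Hypothetical helper: turn the text of a <rating> element into a label.
sub rating_text {
    my ($data) = @_;

    # Accept only a whole number, optionally surrounded by whitespace.
    my ($index) = defined($data) ? $data =~ /^\s*(\d+)\s*$/ : ();

    return "Unrated" unless defined $index && $index < @ratings;
    return $ratings[$index];
}

print rating_text("4"), "\n";      # Good
print rating_text("9"), "\n";      # Unrated (index out of range)
print rating_text("n/a"), "\n";    # Unrated (not a number)
```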

    Here's what the finished script (with some additional HTML, so that you can use it via CGI) looks like:

        #!/usr/bin/perl

        use strict;
        use warnings;

        # include package
        use XML::Parser;

        # initialize parser
        my $xp = XML::Parser->new();

        # set callback functions
        $xp->setHandlers(Start => \&start, End => \&end, Char => \&cdata);

        # keep track of which tag is currently being processed
        my $currentTag = "";

        # send standard header to browser
        print "Content-Type: text/html\n\n";

        # set up HTML page
        print "<html><head></head><body>";
        print "<h2>The Library</h2>";
        print "<table border=1 cellspacing=1 cellpadding=5>";
        print "<tr><td align=center>Title</td><td align=center>Author</td><td align=center>Price</td><td align=center>User Rating</td></tr>";

        # parse XML
        $xp->parsefile("library.xml");

        print "</table></body></html>";

        # this is called when a start tag is found
        sub start {
            # extract variables
            my ($parser, $name, %attr) = @_;
            $currentTag = lc($name);

            if    ($currentTag eq "book")   { print "<tr>"; }
            elsif ($currentTag eq "title")  { print "<td>"; }
            elsif ($currentTag eq "author") { print "<td>"; }
            elsif ($currentTag eq "price")  { print "<td>"; }
            elsif ($currentTag eq "rating") { print "<td>"; }
        }

        # this is called when character data is found
        sub cdata {
            my ($parser, $data) = @_;
            my @ratings = ("Words fail me!", "Terrible", "Bad", "Indifferent", "Good", "Excellent");

            if    ($currentTag eq "title")  { print "<i>$data</i>"; }
            elsif ($currentTag eq "author") { print $data; }
            elsif ($currentTag eq "price")  { print "\$$data"; }
            elsif ($currentTag eq "rating") { print $ratings[$data]; }
        }

        # this is called when an end tag is found
        sub end {
            my ($parser, $name) = @_;
            $currentTag = lc($name);

            if    ($currentTag eq "book")   { print "</tr>"; }
            elsif ($currentTag eq "title")  { print "</td>"; }
            elsif ($currentTag eq "author") { print "</td>"; }
            elsif ($currentTag eq "price")  { print "</td>"; }
            elsif ($currentTag eq "rating") { print "</td>"; }

            # clear value of current tag
            $currentTag = "";
        }

        # end

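    And when you run it, you'll see an HTML page headed "The Library", containing a table with one row per book and columns for Title, Author, Price and User Rating.
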
    You can now add new items to your XML document, or edit existing items, and your rendered HTML page will change accordingly. By separating the data from the presentation, XML has imposed standards on data collections, making it possible, for example, for users with no technical knowledge of HTML to easily update content on a Web site, or to present data from a single source in different ways.
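
    As a rough illustration of presenting the same data in a different way, the sketch below swaps the HTML-printing handlers for ones that build tab-separated lines. The inline XML is made-up sample data, the <library> root element is an assumption about the document structure, and the variable names (@rows, %book, $field, @columns) are hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

my @columns = qw(title author price rating);
my %wanted  = map { $_ => 1 } @columns;

my @rows;     # one tab-separated line per <book>
my %book;     # fields of the book currently being read
my $field;    # name of the field being collected, if any

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($p, $name) = @_;
            $name  = lc $name;
            %book  = () if $name eq 'book';
            $field = $wanted{$name} ? $name : undef;
            $book{$field} = '' if defined $field;
        },
        Char => sub {
            my ($p, $data) = @_;
            $book{$field} .= $data if defined $field;
        },
        End => sub {
            my ($p, $name) = @_;
            $field = undef;
            push @rows, join("\t", map { $book{$_} // '' } @columns)
                if lc($name) eq 'book';
        },
    },
);

$parser->parse(<<'XML');
<library>
  <book><title>Sample Title</title><author>A. Writer</author><price>9.99</price><rating>4</rating></book>
</library>
XML

print "$_\n" for @rows;
```

    The XML source is untouched; only the handlers changed, which is the practical payoff of keeping data and presentation apart.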

    © 2003-2008 by Developer Shed. All rights reserved.