XML Parsing With SAX and Xerces (part 1)
By: icarus, (c) Melonfire
    2002-01-28

    Table of Contents:
  • XML Parsing With SAX and Xerces (part 1)
  • Playing The SAX
  • Reaching For The Nailgun
  • Under The Microscope
  • Sweeping Up The Mess
  • Diving Deeper



    XML Parsing With SAX and Xerces (part 1) - Diving Deeper



    This next example goes beyond the simple applications you've just seen to provide a more comprehensive XML parsing and processing demonstration. Here's the XML file I plan to use:

        <?xml version="1.0"?>
        <inventory>
            <item>
                <id>758</id>
                <name>Rusty, jagged nails for nailgun</name>
                <supplier>NailBarn, Inc.</supplier>
                <cost currency="USD">2.99</cost>
                <quantity alert="500">10000</quantity>
            </item>
            <item>
                <id>6273</id>
                <name>Power pack for death ray</name>
                <supplier>QuakePower.domain.com</supplier>
                <cost currency="USD">9.99</cost>
                <quantity alert="20">10</quantity>
            </item>
        </inventory>

    Now, how about parsing this XML file and displaying a breakdown of the data contained within it? With SAX, it's a snap!

        import org.apache.xerces.parsers.SAXParser;
        import org.xml.sax.*;
        import org.xml.sax.helpers.DefaultHandler;
        import java.io.*;

        public class MyThirdSaxApp extends DefaultHandler {

            // constructor
            public MyThirdSaxApp(String xmlFile) {
                // create a Xerces SAX parser
                SAXParser parser = new SAXParser();

                // set the content handler
                parser.setContentHandler(this);

                // parse the document
                try {
                    parser.parse(xmlFile);
                } catch (SAXException e) {
                    System.err.println(e);
                } catch (IOException e) {
                    System.err.println(e);
                }
            }

            // callback definitions start here

            // call this at document start
            public void startDocument() {
                System.out.println("Document begins");
            }

            // call this when start tag found
            public void startElement(String uri, String local, String qName, Attributes atts) {
                System.out.println("Element begins: \"" + local + "\"");
                String attributeName, attributeType, attributeValue;
                for (int i = 0; i < atts.getLength(); i++) {
                    // look up each attribute by its index in the list
                    attributeName = atts.getLocalName(i);
                    attributeType = atts.getType(i);
                    attributeValue = atts.getValue(i);
                    System.out.println("Attribute: \"" + attributeName + "\"");
                    System.out.println("\tType: \"" + attributeType + "\"");
                    System.out.println("\tValue: \"" + attributeValue + "\"");
                }
            }

            // call this when character data found
            public void characters(char[] text, int start, int length) {
                String content = new String(text, start, length);
                // skip chunks that contain nothing but whitespace
                if (!content.trim().equals("")) {
                    System.out.println("Character data: \"" + content + "\"");
                }
            }

            // call this when end tag found
            public void endElement(String uri, String local, String qName) {
                System.out.println("Element ends: \"" + local + "\"");
            }

            // call this at document end
            public void endDocument() {
                System.out.println("Document ends");
            }

            // the main method; expects the XML file name as first argument
            public static void main(String[] args) {
                MyThirdSaxApp myThirdExample = new MyThirdSaxApp(args[0]);
            }
        }

    Here's the output:

        Document begins
        Element begins: "inventory"
        Element begins: "item"
        Element begins: "id"
        Character data: "758"
        Element ends: "id"
        Element begins: "name"
        Character data: "Rusty, jagged nails for nailgun"
        Element ends: "name"
        Element begins: "supplier"
        Character data: "NailBarn, Inc."
        Element ends: "supplier"
        Element begins: "cost"
        Attribute: "currency"
            Type: "CDATA"
            Value: "USD"
        Character data: "2.99"
        Element ends: "cost"
        Element begins: "quantity"
        Attribute: "alert"
            Type: "CDATA"
            Value: "500"
        Character data: "10000"
        Element ends: "quantity"
        Element ends: "item"
        Element begins: "item"
        Element begins: "id"
        Character data: "6273"
        Element ends: "id"
        Element begins: "name"
        Character data: "Power pack for death ray"
        Element ends: "name"
        Element begins: "supplier"
        Character data: "QuakePower.domain.com"
        Element ends: "supplier"
        Element begins: "cost"
        Attribute: "currency"
            Type: "CDATA"
            Value: "USD"
        Character data: "9.99"
        Element ends: "cost"
        Element begins: "quantity"
        Attribute: "alert"
            Type: "CDATA"
            Value: "20"
        Character data: "10"
        Element ends: "quantity"
        Element ends: "item"
        Element ends: "inventory"
        Document ends

    Most of this should be familiar to you by now, so I'm going to concentrate on the callback functions used in the example above:

    First up, the startDocument() callback, invoked when the parser encounters the beginning of an XML document. Here, the function merely prints a string indicating the start of the document; you could also use it to print a header, or initialize document-specific variables.

        // call this at document start
        public void startDocument() {
            System.out.println("Document begins");
        }

    Next, it's the turn of the startElement() callback, discussed in detail a few pages back...although this one adds a new wrinkle by also accounting for element attributes.

        public void startElement(String uri, String local, String qName, Attributes atts) {
            System.out.println("Element begins: \"" + local + "\"");
            String attributeName, attributeType, attributeValue;
            for (int i = 0; i < atts.getLength(); i++) {
                // look up each attribute by its index in the list
                attributeName = atts.getLocalName(i);
                attributeType = atts.getType(i);
                attributeValue = atts.getValue(i);
                System.out.println("Attribute: \"" + attributeName + "\"");
                System.out.println("\tType: \"" + attributeType + "\"");
                System.out.println("\tValue: \"" + attributeValue + "\"");
            }
        }

    Note that attributes attached to the element are automatically passed to the startElement() callback as an Attributes object, which behaves like an indexed list rather than a plain array. Its getLength() method returns the number of attributes, and detailed information on each one can be obtained via the methods getLocalName() (or getQName()), getType() and getValue(), each of which accepts the attribute's index.
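
    To experiment with attribute lookups without parsing a file, the list can be built by hand. The sketch below is illustrative only: the AttributeDump class and its describe() method are made-up names, and it uses the standard org.xml.sax.helpers.AttributesImpl helper to stand in for the list a parser would pass for the <cost currency="USD"> element.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper class, not part of the article's example.
public class AttributeDump {

    // Describe every attribute in the list, reading each one by index.
    static List<String> describe(Attributes atts) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < atts.getLength(); i++) {
            lines.add(atts.getQName(i) + " (" + atts.getType(i) + ") = " + atts.getValue(i));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Build by hand the list a parser would pass for <cost currency="USD">.
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "currency", "currency", "CDATA", "USD");

        for (String line : describe(atts)) {
            System.out.println(line);   // prints: currency (CDATA) = USD
        }

        // A value can also be fetched directly by its qualified name.
        System.out.println(atts.getValue("currency"));   // prints: USD
    }
}
```

    Index-based access is handy when you want to walk every attribute; lookup by name is more convenient when you already know which attribute you are after.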

    The characters() callback handles character data. It receives a character array, together with the starting offset and length of the relevant text within that array:

        public void characters(char[] text, int start, int length) {
            String content = new String(text, start, length);
            // skip chunks that contain nothing but whitespace
            if (!content.trim().equals("")) {
                System.out.println("Character data: \"" + content + "\"");
            }
        }

    Sadly, this information is passed as a slice of a character array, rather than a single string: only the "length" characters beginning at offset "start" belong to the current event. The callback therefore has to build a String from that slice before it can do anything useful with the data, which is what the first line of the code above does.

    It's important to note that the parser will also invoke the characters() callback when it encounters whitespace within the XML document. As you might imagine, this can lead to strange results, especially if you're new to XML programming. I've used the trim() string function to spare myself the agony - you should do the same.
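
    One more wrinkle: the SAX specification allows a parser to deliver the text of a single element in several characters() calls, so a common approach is to append each chunk to a buffer and read the buffer only when the element ends. Here's a minimal sketch of that idea. The TextCollector class and its collect() method are invented for this illustration, and it assumes the JAXP parser bundled with current JDKs (javax.xml.parsers.SAXParserFactory) rather than the Xerces class used elsewhere in this article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers character data until the element closes.
public class TextCollector extends DefaultHandler {

    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // discard whitespace collected between tags
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // append only the valid slice of the array
        buffer.append(text, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        String content = buffer.toString().trim();
        if (!content.isEmpty()) {
            values.add(qName + "=" + content);
        }
        buffer.setLength(0);
    }

    // Parse an XML string and return "element=text" pairs for non-empty text.
    static List<String> collect(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.values;
    }

    public static void main(String[] args) {
        String xml = "<item>\n  <id>758</id>\n  <name>Rusty, jagged nails for nailgun</name>\n</item>";
        for (String value : collect(xml)) {
            System.out.println(value);
        }
    }
}
```

    Because the buffer is trimmed only once, at the end of the element, whitespace-only chunks between tags are dropped while text that arrives in pieces is still reassembled correctly.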

    The endElement() callback is invoked when the parser hits the end of an element. Note that this callback receives the name of the ending element as an argument.

        public void endElement(String uri, String local, String qName) {
            System.out.println("Element ends: \"" + local + "\"");
        }

    Finally, the endDocument() callback is triggered when the end of the document is reached.

        // call this at document end
        public void endDocument() {
            System.out.println("Document ends");
        }

    All these callbacks, acting in concert, result in the output described a few paragraphs back.
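
    The same callbacks can also do more than print what they see. As a rough sketch, the handler below remembers each item's name and flags items whose quantity has fallen below the value of the "alert" attribute. Treating "alert" as a reorder threshold is my assumption about the sample data, and the LowStockReport class and findLowStock() method are invented for this illustration. It again uses the JAXP parser bundled with the JDK rather than the Xerces class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: lists items whose quantity is below their alert level.
public class LowStockReport extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final List<String> lowStock = new ArrayList<>();
    private String currentName = "";
    private int alertLevel;

    @Override
    public void startDocument() {
        // initialize document-specific state
        lowStock.clear();
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0);
        if ("quantity".equals(qName)) {
            // assumption: "alert" holds the reorder threshold; treat a missing one as 0
            String alert = atts.getValue("alert");
            alertLevel = (alert == null) ? 0 : Integer.parseInt(alert);
        }
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        text.append(chars, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        String content = text.toString().trim();
        if ("name".equals(qName)) {
            currentName = content;
        } else if ("quantity".equals(qName) && Integer.parseInt(content) < alertLevel) {
            lowStock.add(currentName);
        }
        text.setLength(0);
    }

    // Parse an inventory document and return the names of low-stock items.
    static List<String> findLowStock(String xml) {
        LowStockReport handler = new LowStockReport();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.lowStock;
    }

    public static void main(String[] args) {
        String xml = "<inventory>"
                + "<item><name>Rusty, jagged nails for nailgun</name>"
                + "<quantity alert=\"500\">10000</quantity></item>"
                + "<item><name>Power pack for death ray</name>"
                + "<quantity alert=\"20\">10</quantity></item>"
                + "</inventory>";
        for (String name : findLowStock(xml)) {
            System.out.println("Low stock: " + name);   // prints: Low stock: Power pack for death ray
        }
    }
}
```

    Since SAX hands you one event at a time and keeps no tree in memory, any context you need later (here, the current item's name and alert level) has to be stored in fields as the events go by.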

    Obviously, this is just one illustration of the applications of the Xerces SAX parser. You can do a lot more with it...and in the second part of this article, I'll build on everything you just learnt to demonstrate how the Xerces SAX parser can be combined with JSP to format XML documents for a Web browser. I'll also take a look at the error-handling functions built into the parser, demonstrating how they can be used to trap and catch errors in XML processing. Make sure you come back for that one!

    Note: All examples in this article have been tested with JDK 1.3.0, Apache 1.3.11, mod_jk 1.1.0, Xerces 1.4.4 and Tomcat 3.3. Examples are illustrative only, and are not meant for a production environment. YMMV!

     

       
