XML Parsing With SAX and Xerces (part 2)

The first part of this article demonstrated the basics of the Xerces XML parser, explaining how it could be used to process XML documents in a non-Web environment. This concluding section closes the circle, taking everything you’ve learned so far and demonstrating how it can be applied to create dynamic Web pages from static XML documents with Xerces.

In the first part of this article, I introduced you to the Xerces XML parser, explaining how it could be used to parse XML documents using an event-driven approach called SAX. I also demonstrated how the parser worked by using it in a couple of simple Java programs, and explained some of the interfaces and callbacks available in the API.

Now, writing a Java program to parse an XML document is all well and good. However, it’s not all that useful if you’re a Web developer whose primary goal is to generate Web pages dynamically from an XML file. And so, this concluding part takes everything you learned last time and tosses it out into the wild and wacky world of the Web, demonstrating how Java, JSP, Xerces and XML can be combined to create simple, real-world Web applications. Take a look!

The Write Stuff

As in the first part of this article, we’ll begin with something simple.

Let’s go back to that XML file I created in the first part of this article:

<?xml version="1.0"?>
<inventory>
    <item>
        <id>758</id>
        <name>Rusty, jagged nails for nailgun</name>
        <supplier>NailBarn, Inc.</supplier>
        <cost>2.99</cost>
        <quantity>10000</quantity>
    </item>
    <item>
        <id>6273</id>
        <name>Power pack for death ray</name>
        <supplier>QuakePower.domain.com</supplier>
        <cost>9.99</cost>
        <quantity>10</quantity>
    </item>
</inventory>
Remember that event trail you saw in one of the very first examples? This next example ports that code to work with a Web server.

import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import java.io.*;

public class MyFourthSaxApp extends DefaultHandler {

    private Writer out;

    // constructor
    public MyFourthSaxApp (String xmlFile, Writer out) {
        this.out = out;

        // create a Xerces SAX parser
        SAXParser parser = new SAXParser();

        // set the content handler
        parser.setContentHandler(this);

        // parse the document
        try {
            parser.parse(xmlFile);
            out.flush();
        } catch (SAXException e) {
            // something went wrong!
        } catch (IOException e) {
            // something went wrong!
        }
    }

    // call at document start
    public void startDocument() {
        try {
            out.write ("<h1>Document begins</h1><br>");
        } catch (IOException e) {
            // do nothing
        }
    }

    // call at element start
    public void startElement (String uri, String local, String qName, Attributes atts) {
        try {
            out.write ("<h2>Element begins: \"" + local + "\"</h2>");
            String AttributeName,AttributeType,AttributeValue = "";
            for (int i = 0; i < atts.getLength(); i++) {
                AttributeName = atts.getLocalName(i);
                // look up type and value by index; the String overloads expect a qualified name
                AttributeType = atts.getType(i);
                AttributeValue = atts.getValue(i);
                out.write ("<h3>Attribute: \"" + AttributeName + "\"<br>");
                out.write ("&nbsp;&nbsp;&nbsp;Type: \"" + AttributeType + "\"<br>");
                out.write ("&nbsp;&nbsp;&nbsp;Value: \"" + AttributeValue + "\"<br></h3>");
            }
        } catch (IOException e) {
            // do nothing
        }
    }

    // call when cdata found
    public void characters(char[] text, int start, int length) {
        try {
            String Content = new String(text, start, length);
            if (!Content.trim().equals("")){
                out.write("<h4>Character data: \"" + Content + "\"<br></h4>");
            }
        } catch (IOException e) {
            // do nothing
        }
    }

    // call at element end
    public void endElement (String uri, String local, String qName){
        try {
            out.write("<h2> Element ends: \"" + local + "\"<br></h2>");
        } catch (IOException e) {
            // do nothing
        }
    }

    // call at document end
    public void endDocument() {
        try {
            out.write ("<h1>Document ends</h1><br>");
        } catch (IOException e) {
            // do nothing
        }
    }
}
I don’t want to get into the details of the callbacks here – refer to the explanation for the original example if there’s something that doesn’t seem to make sense – but I will point out some items of interest.

The most important difference between this example and the previous one is the introduction of a new Writer object, which makes it possible to stream output to the browser instead of the standard output device.

private Writer out;
The constructor also needs to be modified to accept two parameters: the name of the XML file, and a reference to the Writer object.

// constructor
public MyFourthSaxApp (String xmlFile, Writer out) {
    this.out = out;
    // constructor code comes here
}
This Writer object will be used to output HTML code to the browser, thereby enabling the dynamic generation of a Web page, as the following snippets demonstrate:

out.write("<h2>Element begins: \"" + local + "\"</h2>");
out.write ("<h1>Document ends</h1><br>");
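Because the handler only depends on java.io.Writer, the same idea can be tried outside a servlet container before wiring it into a page. What follows is a minimal sketch under a few assumptions: the WriterTrailHandler class and its render() helper are hypothetical names invented for illustration, the JDK’s built-in JAXP parser stands in for org.apache.xerces.parsers.SAXParser so that nothing extra is needed on the classpath, and qualified names are printed because the JAXP factory is not namespace-aware by default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not part of the article's code): writes an HTML event trail to any Writer.
class WriterTrailHandler extends DefaultHandler {

    private final Writer out;

    WriterTrailHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        emit("<h2>Element begins: \"" + qName + "\"</h2>\n");
    }

    @Override
    public void characters(char[] text, int start, int length) throws SAXException {
        String content = new String(text, start, length);
        // skip whitespace-only chunks, as the article's example does
        if (!content.trim().isEmpty()) {
            emit("<h4>Character data: \"" + content + "\"</h4>\n");
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        emit("<h2>Element ends: \"" + qName + "\"</h2>\n");
    }

    // single place where Writer errors are converted into SAX errors
    private void emit(String html) throws SAXException {
        try {
            out.write(html);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // Parses an XML string and returns the HTML the handler produced.
    static String render(String xml) {
        try {
            StringWriter buffer = new StringWriter();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new WriterTrailHandler(buffer));
            return buffer.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling WriterTrailHandler.render("<note><to>Bob</to></note>") returns the trail as a string. Passing a JSP page’s out object to the constructor instead of a StringWriter would send the same markup to the browser, since both are Writers.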
Once the MyFourthSaxApp class has been compiled, it can be imported and used in a JSP document, which immediately makes the application Web-friendly.

Here’s the JSP code:

<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>
<%
try {
    MyFourthSaxApp myFourthExample = new MyFourthSaxApp("/www/xerces/WEB-INF/classes/inventory.xml", out);
} catch (Exception e) {
    out.println("Something bad happened!");
}
%>
</body>
</html>
The output of this example is the HTML equivalent of the output of the previous example. Here’s what it looks like:

Nailing It To The Wall

Now, how about something a little more useful? Consider the following expanded version of the inventory file from the previous example:

<?xml version="1.0"?>
<inventory>
    <item>
        <id>758</id>
        <name>Rusty, jagged nails for nailgun</name>
        <supplier>NailBarn, Inc.</supplier>
        <cost>2.99</cost>
        <quantity alert="500">10000</quantity>
    </item>
    <item>
        <id>6273</id>
        <name>Power pack for death ray</name>
        <supplier>QuakePower.domain.com</supplier>
        <cost currency="USD">9.99</cost>
        <quantity alert="20">10</quantity>
    </item>
    <item>
        <id>3784</id>
        <name>Axe</name>
        <supplier>Axe And You Shall Receive, Inc.</supplier>
        <cost currency="USD">56.74</cost>
        <quantity alert="5">25</quantity>
    </item>
    <item>
        <id>986</id>
        <name>NVGs</name>
        <supplier>Quake Eyewear</supplier>
        <cost currency="USD">1399.99</cost>
        <quantity alert="5">2</quantity>
    </item>
</inventory>
Now, let’s suppose I want to display this information in a neatly-formatted table, with those items that I’m low on highlighted in red. My preferred output would look something like this:



Here’s the code to accomplish this:

import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import java.io.*;

public class MyFifthSaxApp extends DefaultHandler {

    private Writer out;
    private String ElementName, AttributeName,AttributeValue = "";
    private Integer Quantity, Alert;

    // constructor
    public MyFifthSaxApp (String xmlFile, Writer out) throws SAXException {
        this.out = out;

        // Create a Xerces SAX parser
        SAXParser parser = new SAXParser();

        // Set the Content Handler
        parser.setContentHandler(this);

        // parse the Document
        try {
            parser.parse(xmlFile);
            out.flush();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // call this when opening element found
    public void startElement (String uri, String local, String qName, Attributes atts) throws SAXException {
        try {
            // this is useful later
            ElementName = local;

            // display table header
            if(local.equals("inventory")) {
                out.write("<h1><font face=Verdana>Inventory Management</font></h1>\n"
                    + "<table width=\"55%\" cellpadding=\"5\" cellspacing=\"5\" border=\"1\">"
                    + "<tr>"
                    + "<td><p align=right><b><font face=Verdana size=2>Code</font></b></p></td>"
                    + "<td><b><font face=Verdana size=2>Name</font></b></td>"
                    + "<td><b><font face=Verdana size=2>Supplier</font></b></td>"
                    + "<td><p align=right><b><font face=Verdana size=2>Cost</font></b></p></td>"
                    + "<td><p align=right><font face=Verdana size=2><b>Quantity</b></font></p></td>"
                    + "</tr>");
            } else if(local.equals("item")) {
                // "item" element starts a new row
                out.write("<tr>");
            } else if( local.equals("name") || local.equals("supplier")) {
                // create table cells within row
                // align strings left, numbers right
                out.write("<td><p align=left><font face=Verdana size=2>");
            } else if( local.equals("id") || local.equals("cost") || local.equals("quantity")) {
                out.write("<td><p align=right><font face=Verdana size=2>");
            } else {
                out.write("<br>");
            }

            for (int i = 0; i < atts.getLength(); i++) {
                AttributeName = atts.getLocalName(i);
                AttributeValue = atts.getValue(AttributeName);
                if(AttributeName.equals("currency")) {
                    out.write(AttributeValue + "&nbsp;");
                } else if(AttributeName.equals("alert")) {
                    Alert = new Integer(AttributeValue);
                } else {
                    out.write("&nbsp;");
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // call this when cdata found
    public void characters(char[] text, int start, int length) throws SAXException {
        try {
            String Content = new String(text, start, length);
            if (!Content.trim().equals("")) {
                if ((ElementName != null && ElementName.equals("quantity"))
                        && (AttributeName != null && AttributeName.equals("alert"))) {
                    Quantity = new Integer(Content);
                    // if quantity lower than expected, highlight in red
                    if((Quantity.intValue()) < (Alert.intValue())) {
                        out.write("<font color=\"#ff0000\">" + Quantity + "</font>");
                    } else {
                        out.write("<font color=\"#000000\">" + Quantity + "</font>");
                    }
                } else {
                    out.write(Content);
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // call this when closing element found
    public void endElement (String uri, String local, String qName) throws SAXException {
        try {
            if(local.equals("inventory")) {
                out.write("</table>");
            } else if(local.equals("item")) {
                // "item" closes table row
                out.write("</tr>");
            } else if(local.equals("id") || local.equals("name") || local.equals("supplier")
                    || local.equals("cost") || local.equals("quantity")) {
                // close table cells
                out.write("</font></p></td>");
            } else {
                out.write("&nbsp;");
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}
As you can see, the callback functions used here have evolved substantially from the previous examples – they now contain more conditional tests, and better error handling capabilities. Let’s take a closer look.

Most of the work in this script is done by the startElement() callback function. This function prints specific HTML output depending on the element encountered by the parser.

// call this when opening element found
public void startElement (String uri, String local, String qName, Attributes atts) throws SAXException {
    try {
        // this is useful later
        ElementName = local;

        // display table header
        if(local.equals("inventory")) {
            out.write("<h1><font face=Verdana>Inventory Management</font></h1>\n"
                + "<table width=\"55%\" cellpadding=\"5\" cellspacing=\"5\" border=\"1\">"
                + "<tr>"
                + "<td><p align=right><b><font face=Verdana size=2>Code</font></b></p></td>"
                + "<td><b><font face=Verdana size=2>Name</font></b></td>"
                + "<td><b><font face=Verdana size=2>Supplier</font></b></td>"
                + "<td><p align=right><b><font face=Verdana size=2>Cost</font></b></p></td>"
                + "<td><p align=right><font face=Verdana size=2><b>Quantity</b></font></p></td>"
                + "</tr>");
        } else if(local.equals("item")) {
            // "item" element starts a new row
            out.write("<tr>");
        } else if( local.equals("name") || local.equals("supplier")) {
            // create table cells within row
            // align strings left, numbers right
            out.write("<td><p align=left><font face=Verdana size=2>");
        } else if( local.equals("id") || local.equals("cost") || local.equals("quantity")) {
            out.write("<td><p align=right><font face=Verdana size=2>");
        } else {
            out.write("<br>");
        }

        for (int i = 0; i < atts.getLength(); i++) {
            AttributeName = atts.getLocalName(i);
            AttributeValue = atts.getValue(AttributeName);
            if(AttributeName.equals("currency")) {
                out.write(AttributeValue + "&nbsp;");
            } else if(AttributeName.equals("alert")) {
                Alert = new Integer(AttributeValue);
            } else {
                out.write("&nbsp;");
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
This function maps different XML elements to appropriate HTML markup. As you can see, the document element “inventory”, which marks the start of the XML document, is used to create the skeleton and first row of an HTML table, while the different “item” elements correspond to rows within this table. The details of each item – name, supplier, quantity et al – are formatted as cells within each row of the table.

Next, the characters() callback function handles formatting of the content embedded within the elements.

// call this when cdata found
public void characters(char[] text, int start, int length) throws SAXException {
    try {
        String Content = new String(text, start, length);
        if (!Content.trim().equals("")) {
            if ((ElementName != null && ElementName.equals("quantity"))
                    && (AttributeName != null && AttributeName.equals("alert"))) {
                Quantity = new Integer(Content);
                // if quantity lower than expected, highlight in red
                if((Quantity.intValue()) < (Alert.intValue())) {
                    out.write("<font color=\"#ff0000\">" + Quantity + "</font>");
                } else {
                    out.write("<font color=\"#000000\">" + Quantity + "</font>");
                }
            } else {
                out.write(Content);
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
For most of the elements, I’m simply displaying the content as is. The only deviation from this standard policy occurs with the “quantity” element, which has an additional “alert” attribute. This “alert” attribute specifies the minimum number of units that should be in stock of the corresponding item; if the quantity drops below this minimum level, an alert should be generated. Consequently, the characters() callback includes some code to test the current quantity against the minimum quantity, and highlight the data in red if the test fails.
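One caveat worth keeping in mind with the approach above: the SAX specification allows a parser to deliver the text of a single element in more than one characters() call, so converting each chunk to a number on arrival can occasionally misfire. A common alternative is to collect the text in a buffer and run the comparison when the closing tag arrives. The following is a minimal sketch of that idea, not the article’s code; the LowStockHandler class and its findLowStock() helper are hypothetical, it uses the JDK’s built-in JAXP parser instead of the Xerces SAXParser class, and it relies on generics from later Java versions than the JDK 1.3 used elsewhere in this article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the names of items whose quantity is below the alert level.
class LowStockHandler extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final List<String> lowStock = new ArrayList<>();
    private String name = "";
    private int alert = -1; // -1 means "no alert attribute seen"

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for this element
        if ("quantity".equals(qName)) {
            String value = atts.getValue("alert");
            alert = (value == null) ? -1 : Integer.parseInt(value.trim());
        }
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        // may be called several times per element, so only append here
        text.append(chars, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        String value = text.toString().trim();
        if ("name".equals(qName)) {
            name = value;
        } else if ("quantity".equals(qName) && alert >= 0) {
            // the complete text is available now, so the comparison is safe
            if (Integer.parseInt(value) < alert) {
                lowStock.add(name);
            }
        }
        text.setLength(0);
    }

    // Parses an inventory document and returns the names of under-stocked items.
    static List<String> findLowStock(String xml) {
        try {
            LowStockHandler handler = new LowStockHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.lowStock;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The same buffering could be applied to the article’s class by moving the red-or-black decision from characters() into endElement(); the sketch returns a list rather than writing HTML only to keep it short.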

And finally, to wrap things up, the endElement() callback closes the HTML tags opened earlier.

// call this when closing element found
public void endElement (String uri, String local, String qName) throws SAXException {
    try {
        if(local.equals("inventory")) {
            out.write("</table>");
        } else if(local.equals("item")) {
            // "item" closes table row
            out.write("</tr>");
        } else if(local.equals("id") || local.equals("name") || local.equals("supplier")
                || local.equals("cost") || local.equals("quantity")) {
            // close table cells
            out.write("</font></p></td>");
        } else {
            out.write("&nbsp;");
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
Once you’ve compiled this class, you can use it in a JSP page, as you did with the previous example. Here’s the code,

<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>
<%
try {
    MyFifthSaxApp myFifthExample = new MyFifthSaxApp("/www/xerces/WEB-INF/classes/inventory.xml", out);
} catch (Exception e) {
    out.println("<font face=\"verdana\" size=\"2\">The following error occurred: <br><b>" + e + "</b></font>");
}
%>
</body>
</html>
and here’s the output:

When Things Go Wrong

If you take a close look at the previous example, you’ll notice some fairly complex error-handling built into it. It’s instructive to examine that, and understand the reason for its inclusion.

You’ll remember that I defined a Writer object at the top of my program; this Writer object provides a convenient way to output a character stream, either to a file or elsewhere. However, if the object does not initialize correctly, there is no way of communicating the error to the final JSP page.

The solution to the problem is simple: throw an exception. This exception can be captured by the JSP page and resolved appropriately.

Let’s take another look at the startElement() callback, this time focusing on the error-handling built into it:

// call this when opening element found
public void startElement (String uri, String local, String qName, Attributes atts) throws SAXException {
    try {
        // snip
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
In the earlier examples, the startElement() override did not declare any exceptions. The SAX ContentHandler interface, however, allows each callback to throw a SAXException, so the method can be declared with a throws clause. An error raised by the Writer object is then wrapped in a SAXException and propagated to the target JSP document.

Why is this necessary? Because if you don’t do this, and your Writer object throws an error, there’s no way of letting the JSP document know what happened, simply because the Writer object is the only available line of communication between the Java class and the JSP document. It’s a little like that chicken-and-egg situation we all know and love…
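To see the propagation in isolation, here is a small sketch that forces the Writer to fail and then recovers the original error on the calling side. Everything in it is illustrative rather than taken from the article: FailingWriter, PropagatingHandler and parseAndReport() are hypothetical names, the error message is made up, and the JDK’s built-in JAXP parser is used in place of the Xerces SAXParser class. The one library detail it relies on is SAXException.getException(), which returns the exception that was wrapped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical Writer that always fails, standing in for a broken output stream.
class FailingWriter extends Writer {
    @Override
    public void write(char[] buffer, int offset, int length) throws IOException {
        throw new IOException("client disconnected");
    }
    @Override
    public void flush() { }
    @Override
    public void close() { }
}

// Hypothetical handler that wraps Writer errors, as the article's startElement() does.
class PropagatingHandler extends DefaultHandler {

    private final Writer out;

    PropagatingHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        try {
            out.write("<p>" + qName + "</p>");
        } catch (IOException e) {
            // the Writer is unusable, so report the problem through the parser instead
            throw new SAXException(e);
        }
    }

    // Plays the role of the JSP page: runs the parse and turns any failure into a message.
    static String parseAndReport(String xml, Writer out) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new PropagatingHandler(out));
            return "ok";
        } catch (SAXException e) {
            Exception cause = e.getException(); // the wrapped IOException, if any
            return "write failed: " + (cause != null ? cause.getMessage() : e.getMessage());
        } catch (Exception e) {
            return "other failure: " + e;
        }
    }
}
```

With a working StringWriter, parseAndReport() returns "ok"; with the FailingWriter, the caller gets the wrapped message back even though nothing could be written to the output.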

Now, in the JSP page, it’s possible to set up a basic error resolution mechanism to display the error on the screen. In order to test-drive it, try removing one of the opening “item” tags from the XML document used in this example and accessing the JSP page again through your browser.

Skinning A Cat, Technique Two

How about another example, this one utilizing a different technique to format XML into HTML?

Here’s the XML file I plan to use – it’s a simple to-do list, with tasks, priorities and due dates marked up in XML.

<?xml version="1.0"?>
<todo>
    <item>
        <priority>1</priority>
        <task>Figure out how Xerces works</task>
        <due>2001-12-12</due>
    </item>
    <item>
        <priority>2</priority>
        <task>Conquer the last Quake map</task>
        <due>2001-12-31</due>
    </item>
    <item>
        <priority>3</priority>
        <task>Buy a Ferrari</task>
        <due>2005-12-31</due>
    </item>
    <item>
        <priority>1</priority>
        <task>File tax return</task>
        <due>2002-03-31</due>
    </item>
    <item>
        <priority>3</priority>
        <task>Learn to cook</task>
        <due>2002-06-30</due>
    </item>
</todo>
As with the previous examples, this has two components: the source code for the Java class, and the JSP page which uses the class. Here’s the class:

import java.util.*;
import java.io.*;
import org.xml.sax.*;
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.helpers.DefaultHandler;

public class MySixthSaxApp extends DefaultHandler {

    private Writer out;
    private String ElementName = "";

    // define a hash table to store HTML markup
    // this hash table is used in the callback functions
    // for start, end and character elements ("priority" only)
    private Map StartElementHTML = new HashMap();
    private Map EndElementHTML = new HashMap();
    private Map PriorityHTML = new HashMap();

    // constructor
    public MySixthSaxApp (String xmlFile, Writer out) throws SAXException {
        this.out = out;

        // initialize StartElementHTML Hashmap
        StartElementHTML.put("todo","<ol>\n");
        StartElementHTML.put("item","<li>");
        StartElementHTML.put("task","<b>");
        StartElementHTML.put("due","&nbsp;<i>(");

        // initialize EndElementHTML Hashmap
        EndElementHTML.put("todo","</ol>\n");
        EndElementHTML.put("item","</font></li>\n");
        EndElementHTML.put("task","</b>");
        EndElementHTML.put("due",")</i>");

        // initialize PriorityHTML Hashmap
        PriorityHTML.put("1","<font face=\"Verdana\" color=\"#ff0000\" size=\"2\">");
        PriorityHTML.put("2","<font face=\"Verdana\" color=\"#0000ff\" size=\"2\">");
        PriorityHTML.put("3","<font face=\"Verdana\" color=\"#000000\" size=\"2\">");

        // create a Xerces SAX parser
        SAXParser parser = new SAXParser();

        // set the content handler
        parser.setContentHandler(this);

        // parse the document
        try {
            parser.parse(xmlFile);
            out.flush();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // start element callback function
    public void startElement (String uri, String local, String qName, Attributes atts) throws SAXException {
        try {
            // keep track of the element being parsed
            ElementName = local;

            // only call the HashMap table if the element is not the "priority" element
            if(local != null && (!local.equals("priority"))) {
                // this ensures that elements not present
                // in the HashMap are handled
                // basically, taking care of those ugly NullPointerExceptions
                if(StartElementHTML.get(local) != null) {
                    out.write((StartElementHTML.get(local)).toString());
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // cdata callback function
    public void characters(char[] text, int start, int length) throws SAXException {
        try {
            String Content = new String(text, start, length);
            if (!Content.trim().equals("")) {
                if(ElementName != null) {
                    // if the element name is not "priority", then display content
                    if(!ElementName.equals("priority")) {
                        out.write(Content);
                    } else {
                        // if it is the "priority" element
                        // get the HTML tag from the Priority Hashmap
                        // this defines the color for tasks with different priorities
                        if(PriorityHTML.get(Content) != null) {
                            out.write((PriorityHTML.get(Content)).toString());
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // end element callback function
    public void endElement (String uri, String local, String qName) throws SAXException {
        try {
            if(local != null && (!local.equals("priority"))) {
                if(EndElementHTML.get(local) != null) {
                    out.write((EndElementHTML.get(local)).toString());
                    ElementName = null;
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}
This is much cleaner and easier to read than the previous example, since it uses Java’s HashMap object to store key-value pairs mapping XML element names to HTML markup. Three HashMaps have been used here: StartElementHTML, which stores the HTML tags for opening XML elements; EndElementHTML, which stores the HTML tags for closing XML elements; and PriorityHTML, which stores the HTML tags for the “priority” elements defined for each “item”.

These HashMaps are populated with data in the class constructor:

// initialize StartElementHTML Hashmap
StartElementHTML.put("todo","<ol>\n");
StartElementHTML.put("item","<li>");
StartElementHTML.put("task","<b>");
StartElementHTML.put("due","&nbsp;<i>(");

// initialize EndElementHTML Hashmap
EndElementHTML.put("todo","</ol>\n");
EndElementHTML.put("item","</font></li>\n");
EndElementHTML.put("task","</b>");
EndElementHTML.put("due",")</i>");

// initialize PriorityHTML Hashmap
PriorityHTML.put("1","<font face=\"Verdana\" color=\"#ff0000\" size=\"2\">");
PriorityHTML.put("2","<font face=\"Verdana\" color=\"#0000ff\" size=\"2\">");
PriorityHTML.put("3","<font face=\"Verdana\" color=\"#000000\" size=\"2\">");
A string variable named ElementName is also used to store the name of the element currently being parsed; this is used within the characters() callback function.

private String ElementName = "";
Now, when an opening tag is found, the startElement() callback is triggered; this callback function uses the current element name as a key into the HashMap previously defined, retrieves the corresponding HTML markup for that element, and prints it.

// start element callback function
public void startElement (String uri, String local, String qName, Attributes atts) throws SAXException {
    try {
        // keep track of the element being parsed
        ElementName = local;

        // only call the HashMap table if the element is not the "priority" element
        if(local != null && (!local.equals("priority"))) {
            // this ensures that elements not present
            // in the HashMap are handled
            // basically, taking care of those ugly NullPointerExceptions
            if(StartElementHTML.get(local) != null) {
                out.write((StartElementHTML.get(local)).toString());
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
Note the numerous checks to avoid NullPointerExceptions, the bane of every Java programmer on the planet.
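On newer Java versions than the JDK 1.3 this article was tested with, some of those null checks can be folded into the lookup itself. The sketch below assumes Java 8 or later, where java.util.Map offers getOrDefault(); the MarkupTable class and its startMarkup() method are hypothetical names used only for illustration, and the table holds the same opening-tag entries as StartElementHTML.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table: element name -> opening HTML markup.
class MarkupTable {

    private static final Map<String, String> START = new HashMap<>();

    static {
        START.put("todo", "<ol>\n");
        START.put("item", "<li>");
        START.put("task", "<b>");
        START.put("due", "&nbsp;<i>(");
    }

    // Returns the markup for an element, or an empty string for unknown (or null) names.
    static String startMarkup(String element) {
        return START.getOrDefault(element, "");
    }
}
```

A callback could then write MarkupTable.startMarkup(local) unconditionally, since an element that is missing from the table simply contributes nothing to the output. The typed Map also removes the need for the toString() calls.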

With the opening element handled, the next step is to process the character data that follows it. This is handled by the characters() callback, which performs the important task of displaying the element content, with appropriate modification to the font colour depending on the element priority.

// cdata callback function
public void characters(char[] text, int start, int length) throws SAXException {
    try {
        String Content = new String(text, start, length);
        if (!Content.trim().equals("")) {
            if(ElementName != null) {
                // if the element name is not "priority", then display content
                if(!ElementName.equals("priority")) {
                    out.write(Content);
                } else {
                    // if it is the "priority" element
                    // get the HTML tag from the Priority Hashmap
                    // this defines the color for tasks with different priorities
                    if(PriorityHTML.get(Content) != null) {
                        out.write((PriorityHTML.get(Content)).toString());
                    }
                }
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
Here, the priority of the task is used to retrieve the corresponding display colour from the PriorityHTML HashMap, and the content is then printed in that colour.
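The callback writes element content to the page exactly as the parser delivers it. Since the parser has already decoded entities such as &amp;amp; by the time characters() is called, a task description containing an ampersand or an angle bracket would end up as raw markup in the generated HTML. One possible remedy, sketched here with a hypothetical HtmlEscaper helper that is not part of the article’s code, is to escape the text before writing it.

```java
// Hypothetical helper: escapes the characters that have special meaning in HTML.
class HtmlEscaper {

    static String escapeHtml(String text) {
        StringBuilder escaped = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': escaped.append("&amp;"); break;
                case '<': escaped.append("&lt;"); break;
                case '>': escaped.append("&gt;"); break;
                case '"': escaped.append("&quot;"); break;
                default:  escaped.append(c);
            }
        }
        return escaped.toString();
    }
}
```

In the callback, out.write(Content) would become out.write(HtmlEscaper.escapeHtml(Content)). The markup fetched from the HashMaps should not be escaped, because it is meant to reach the browser as HTML.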

Finally, the endElement() callback function replicates the functionality of the startElement() callback, closing the HTML tags opened earlier.

// end element callback function
public void endElement (String uri, String local, String qName) throws SAXException {
    try {
        if(local != null && (!local.equals("priority"))) {
            if(EndElementHTML.get(local) != null) {
                out.write((EndElementHTML.get(local)).toString());
                ElementName = null;
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
And here’s the JSP page that uses the class above:

<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>
<h1><font face="Verdana">My Todo List</font></h1>
<%
try {
    MySixthSaxApp mySixthExample = new MySixthSaxApp("/www/xerces/WEB-INF/classes/todo.xml", out);
} catch (Exception e) {
    out.println("<font face=\"verdana\" size=\"2\">Something bad just happened: <br><b>" + e + "</b></font>");
}
%>
</body>
</html>
And here’s what it looks like:



Because I’ve used HashMaps to map XML elements to HTML markup, the code in the example above is cleaner and easier to maintain. Further, this approach makes it simpler to edit the XML-to-HTML mapping; if I need to add a new element to the source XML document, I need only update the HashMaps in my class code, with minimal modification to the callbacks themselves.

Endnote

That’s about it for this article. Over the preceding pages, you learned more than you ever wanted to know about the Xerces SAX parser, using it to develop simple XML-based applications in both Web and non-Web environments. You (hopefully) understood how SAX works, gained an insight into what callback functions do, and learned how to use Xerces’ interfaces in combination with simple Java constructs to quickly and easily create dynamic Web pages from static XML documents.

I hope you enjoyed it, and that it helped you to gain a greater understanding of how to process XML and use it in a Java-based environment – both on and off the Web. In case you’d like more information on the topic, you should consider bookmarking the following sites:

The official Xerces Web page, at http://xml.apache.org/xerces-j/

The Xerces FAQ, at http://xml.apache.org/xerces-j/faq-write.html

The SAX project, at http://www.saxproject.org/

The Xerces-Java Quick Start, at http://www.ecerami.com/xerces/

See you soon!

Note: All examples in this article have been tested with JDK 1.3.0, Apache 1.3.11, mod_jk 1.1.0, Xerces 1.4.4 and Tomcat 3.3. Examples are illustrative only, and are not meant for a production environment. YMMV!