Page 4 - XML Parsing With SAX and Xerces (part 2)
When Things Go Wrong
The first part of this article demonstrated the basics of the Xerces XML parser, explaining how it could be used to process XML documents in a non-Web environment. This concluding section closes the circle, taking everything you've learned so far and demonstrating how it can be applied to create dynamic Web pages from static XML documents with Xerces.
If you take a close look at the previous example, you'll notice some fairly complex error-handling built into it. It's instructive to examine that, and understand the reason for its inclusion.
You'll remember that I defined a Writer object at the top of my program; this Writer object provides a convenient way to output a character stream, either to a file or elsewhere. However, if the object does not initialize correctly, there is no way of communicating the error to the final JSP page.
The solution to the problem is simple: throw an exception. This exception can be captured by the JSP page and resolved appropriately.
Let's take another look at the startElement() callback, this time focusing on the error-handling built into it:
// call this when opening element found
public void startElement (String uri, String local, String qName, Attributes atts)
        throws SAXException {
    try {
        // snip
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
By default, the startElement() callback inherited from DefaultHandler is an empty method, so it never throws anything itself.
However, its signature does declare SAXException, which makes it possible to wrap any IOException raised by the Writer object in a SAXException and propagate this error to the target JSP document.
Why is this necessary? Because if you don't do this, and your Writer object throws an error, there's no way of letting the JSP document know what happened, simply because the Writer object is the only available line of communication between the Java class and the JSP document. It's a little like that chicken-and-egg situation we all know and love...
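To make the propagation path concrete, here is a minimal sketch of the same wrap-and-rethrow pattern outside a JSP container. It rests on a few assumptions that are not part of the original example: it uses the JAXP parser bundled with the JDK instead of the Xerces SAXParser class, and the names WriterFailureDemo, BrokenWriter and EchoHandler are hypothetical, with BrokenWriter standing in for a JSP output stream that has gone bad.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WriterFailureDemo {

    // hypothetical Writer that always fails, standing in for a broken output stream
    static class BrokenWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("output stream is closed");
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // handler that wraps any IOException from the Writer in a SAXException
    static class EchoHandler extends DefaultHandler {
        private final Writer out;

        EchoHandler(Writer out) {
            this.out = out;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            try {
                out.write("<" + qName + ">");
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }
    }

    // parses the XML and returns "ok", or a short description of the failure
    static String parse(String xml, Writer out) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new EchoHandler(out));
            return "ok";
        } catch (SAXException e) {
            // getException() returns the wrapped IOException, if there is one
            Exception cause = e.getException();
            return "parse failed: " + (cause != null ? cause.getMessage() : e.getMessage());
        } catch (Exception e) {
            return "setup failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<todo><item/></todo>", new BrokenWriter()));
    }
}
```

The caller never touches the Writer directly; it learns about the failure only because the callback converted the IOException into the one exception type the SAX callback signature allows.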
Now, in the JSP page, it's possible to set up a basic error resolution mechanism to display the error on the screen. In order to test-drive it, try removing one of the opening "item" tags from the XML document used in this example and accessing the JSP page again through your browser.
Skinning A Cat, Technique Two
How about another example, this one utilizing a different technique to format XML into HTML?
Here's the XML file I plan to use - it's a simple to-do list, with tasks, priorities and due dates marked up in XML.
<?xml version="1.0"?>
<todo>
    <item>
        <priority>1</priority>
        <task>Figure out how Xerces works</task>
        <due>2001-12-12</due>
    </item>
    <item>
        <priority>2</priority>
        <task>Conquer the last Quake map</task>
        <due>2001-12-31</due>
    </item>
    <item>
        <priority>3</priority>
        <task>Buy a Ferrari</task>
        <due>2005-12-31</due>
    </item>
    <item>
        <priority>1</priority>
        <task>File tax return</task>
        <due>2002-03-31</due>
    </item>
    <item>
        <priority>3</priority>
        <task>Learn to cook</task>
        <due>2002-06-30</due>
    </item>
</todo>
As with the previous examples, this has two components: the source code for the Java class, and the JSP page which uses the class. Here's the class:
import java.util.*;
import java.io.*;
import org.xml.sax.*;
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.helpers.DefaultHandler;

public class MySixthSaxApp extends DefaultHandler {

    private Writer out;
    private String ElementName = "";

    // define hash tables to store HTML markup
    // these hash tables are used in the callback functions
    // for start, end and character elements ("priority" only)
    private Map StartElementHTML = new HashMap();
    private Map EndElementHTML = new HashMap();
    private Map PriorityHTML = new HashMap();

    // constructor
    public MySixthSaxApp (String xmlFile, Writer out) throws SAXException {
        this.out = out;

        // initialize StartElementHTML HashMap
        StartElementHTML.put("todo", "<ol>\n");
        StartElementHTML.put("item", "<li>");
        StartElementHTML.put("task", "<b>");
        StartElementHTML.put("due", " <i>(");

        // initialize EndElementHTML HashMap
        EndElementHTML.put("todo", "</ol>\n");
        EndElementHTML.put("item", "</font></li>\n");
        EndElementHTML.put("task", "</b>");
        EndElementHTML.put("due", ")</i>");

        // initialize PriorityHTML HashMap
        PriorityHTML.put("1", "<font face=\"Verdana\" color=\"#ff0000\" size=\"2\">");
        PriorityHTML.put("2", "<font face=\"Verdana\" color=\"#0000ff\" size=\"2\">");
        PriorityHTML.put("3", "<font face=\"Verdana\" color=\"#000000\" size=\"2\">");

        // create a Xerces SAX parser
        SAXParser parser = new SAXParser();

        // set the content handler
        parser.setContentHandler(this);

        // parse the document
        try {
            parser.parse(xmlFile);
            out.flush();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // start element callback function
    public void startElement (String uri, String local, String qName, Attributes atts)
            throws SAXException {
        try {
            // keep track of the element being parsed
            ElementName = local;

            // only call the HashMap table if the element is not the "priority" element
            if (local != null && (!local.equals("priority"))) {
                // this ensures that elements not present
                // in the HashMap are handled
                // basically, taking care of those ugly NullPointerExceptions
                if (StartElementHTML.get(local) != null) {
                    out.write((StartElementHTML.get(local)).toString());
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // cdata callback function
    public void characters(char[] text, int start, int length) throws SAXException {
        try {
            String Content = new String(text, start, length);
            if (!Content.trim().equals("")) {
                if (ElementName != null) {
                    // if the element name is not "priority", then display content
                    if (!ElementName.equals("priority")) {
                        out.write(Content);
                    } else {
                        // if it is the "priority" element
                        // get the HTML tag from the Priority HashMap
                        // this defines the color for tasks with different priorities
                        if (PriorityHTML.get(Content) != null) {
                            out.write((PriorityHTML.get(Content)).toString());
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // end element callback function
    public void endElement (String uri, String local, String qName) throws SAXException {
        try {
            if (local != null && (!local.equals("priority"))) {
                if (EndElementHTML.get(local) != null) {
                    out.write((EndElementHTML.get(local)).toString());
                    ElementName = null;
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}
This is much cleaner and easier to read than the previous example, since it uses Java's HashMap object to store key-value pairs mapping XML element names to HTML markup. Three HashMaps have been used here: StartElementHTML, which stores the HTML tags for opening XML elements; EndElementHTML, which stores the HTML tags for closing XML elements; and PriorityHTML, which stores the HTML tags for the "priority" elements defined for each "item".
These HashMaps are populated with data in the class constructor, through the series of put() calls in the listing above.
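For readers who want to see the population step in isolation, the following is a small standalone sketch of the start and end tables. The class name MarkupTables and its two helper methods are hypothetical, and the sketch uses generics (Map<String, String>), which were not available when the class above was written; the keys and values are the ones used in the article's constructor.

```java
import java.util.HashMap;
import java.util.Map;

public class MarkupTables {

    // builds the table of opening HTML markup, keyed by XML element name
    static Map<String, String> startTable() {
        Map<String, String> table = new HashMap<>();
        table.put("todo", "<ol>\n");
        table.put("item", "<li>");
        table.put("task", "<b>");
        table.put("due", " <i>(");
        return table;
    }

    // builds the table of closing HTML markup, keyed by XML element name
    static Map<String, String> endTable() {
        Map<String, String> table = new HashMap<>();
        table.put("todo", "</ol>\n");
        table.put("item", "</font></li>\n");
        table.put("task", "</b>");
        table.put("due", ")</i>");
        return table;
    }

    public static void main(String[] args) {
        // "priority" is deliberately absent from both tables, so get() returns null for it
        System.out.println(startTable().get("task") + "Buy a Ferrari" + endTable().get("task"));
        System.out.println(startTable().get("priority"));
    }
}
```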
A string variable named ElementName is also used to store the name of the element currently being parsed; this is used within the characters() callback function.
private String ElementName = "";
Now, when an opening tag is found, the startElement() callback is triggered; this callback function uses the current element name as a key into the HashMap previously defined, retrieves the corresponding HTML markup for that element, and prints it.
// start element callback function
public void startElement (String uri, String local, String qName, Attributes atts)
        throws SAXException {
    try {
        // keep track of the element being parsed
        ElementName = local;

        // only call the HashMap table if the element is not the "priority" element
        if (local != null && (!local.equals("priority"))) {
            // this ensures that elements not present
            // in the HashMap are handled
            // basically, taking care of those ugly NullPointerExceptions
            if (StartElementHTML.get(local) != null) {
                out.write((StartElementHTML.get(local)).toString());
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
Note the numerous checks to avoid NullPointerExceptions, the bane of every Java programmer on the planet.
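On newer Java versions the same protection can be had with less nesting. The sketch below is an alternative, not the code used in this article: it assumes Java 8 or later for Map.getOrDefault(), and the class name SafeLookup and method markupFor() are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SafeLookup {

    // returns the markup for an element, or "" when the element is null or unmapped
    static String markupFor(Map<String, String> table, String element) {
        if (element == null) {
            return "";
        }
        return table.getOrDefault(element, "");
    }

    public static void main(String[] args) {
        Map<String, String> startHtml = new HashMap<>();
        startHtml.put("task", "<b>");

        // a mapped element yields its markup; anything else yields an empty string
        System.out.println("[" + markupFor(startHtml, "task") + "]");
        System.out.println("[" + markupFor(startHtml, "priority") + "]");
    }
}
```

Because the helper always returns a string, a callback can write the result unconditionally instead of testing for null before every write.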
With the opening element handled, the next step is to process the character data that follows it. This is handled by the characters() callback, which performs the important task of displaying the element content, with appropriate modification to the font colour depending on the element priority.
// cdata callback function
public void characters(char[] text, int start, int length) throws SAXException {
    try {
        String Content = new String(text, start, length);
        if (!Content.trim().equals("")) {
            if (ElementName != null) {
                // if the element name is not "priority", then display content
                if (!ElementName.equals("priority")) {
                    out.write(Content);
                } else {
                    // if it is the "priority" element
                    // get the HTML tag from the Priority HashMap
                    // this defines the color for tasks with different priorities
                    if (PriorityHTML.get(Content) != null) {
                        out.write((PriorityHTML.get(Content)).toString());
                    }
                }
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
Here, the priority of the task is used as a key to retrieve the corresponding font tag from the PriorityHTML HashMap; that tag is written out first, so the task content that follows is printed in the matching colour.
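One caveat is worth keeping in mind with this approach: the SAX specification allows a parser to deliver an element's text in several characters() calls, so a lookup keyed on the text of a single call can occasionally miss. A common defensive variation, sketched below, is to collect the text in a buffer and act on it in endElement(). This is an illustration under assumptions rather than the article's code: it uses the JDK's bundled JAXP parser, keys on qName because that parser is not namespace-aware by default, and the names BufferedTextDemo, BufferingHandler and collect() are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BufferedTextDemo {

    static class BufferingHandler extends DefaultHandler {
        private final StringBuilder buffer = new StringBuilder();
        final Map<String, String> values = new LinkedHashMap<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            // start collecting text afresh for the new element
            buffer.setLength(0);
        }

        @Override
        public void characters(char[] text, int start, int length) {
            // may be called more than once for the same element
            buffer.append(text, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            // the complete text of the element is only known at this point
            String content = buffer.toString().trim();
            if (!content.isEmpty()) {
                values.put(qName, content);
            }
            buffer.setLength(0);
        }
    }

    // parses the XML and returns the trimmed text of each leaf element, keyed by element name
    static Map<String, String> collect(String xml) {
        try {
            BufferingHandler handler = new BufferingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            collect("<item><priority> 1 </priority><task>Buy a Ferrari</task></item>"));
    }
}
```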
Finally, the endElement() callback function mirrors the startElement() callback: it looks up the element name in the EndElementHTML HashMap and writes the markup that closes the HTML tags opened earlier.
// end element callback function
public void endElement (String uri, String local, String qName) throws SAXException {
    try {
        if (local != null && (!local.equals("priority"))) {
            if (EndElementHTML.get(local) != null) {
                out.write((EndElementHTML.get(local)).toString());
                ElementName = null;
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}
}
And here's the JSP page that uses the class above:
<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>
<h1><font face="Verdana">My Todo List</font></h1>
<%
try {
    MySixthSaxApp mySixthExample = new MySixthSaxApp("/www/xerces/WEB-INF/classes/todo.xml", out);
} catch (Exception e) {
    out.println("<font face=\"verdana\" size=\"2\">Something bad just happened: <br><b>" + e + "</b></font>");
}
%>
</body>
</html>
And here's what it looks like:
Because I've used HashMaps to map XML elements to HTML markup, the code in the example above is cleaner and easier to maintain. Further, this approach makes it simpler to edit the XML-to-HTML mapping; if I need to add a new element to the source XML document, I need only update the HashMaps in my class code, with minimal modification to the callbacks themselves.
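To illustrate how little has to change, here is a compact, self-contained sketch of the table-driven idea in which support for one extra element is added with a single line. Several things in it are assumptions made for the sake of a runnable example: the "note" element and its markup are hypothetical, the priority handling is left out, the JDK's bundled JAXP parser replaces the Xerces SAXParser class, and the names TableDrivenDemo, TableHandler and render() are invented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TableDrivenDemo {

    static class TableHandler extends DefaultHandler {
        private final Map<String, String> startHtml = new HashMap<>();
        private final Map<String, String> endHtml = new HashMap<>();
        private final Writer out;

        TableHandler(Writer out) {
            this.out = out;
            map("todo", "<ol>", "</ol>");
            map("item", "<li>", "</li>");
            map("task", "<b>", "</b>");
            // hypothetical new element: one extra line here, no callback changes
            map("note", "<em>", "</em>");
        }

        // registers the opening and closing markup for one XML element
        private void map(String element, String open, String close) {
            startHtml.put(element, open);
            endHtml.put(element, close);
        }

        // writes markup, converting Writer failures into SAXExceptions
        private void emit(String html) throws SAXException {
            try {
                out.write(html);
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            emit(startHtml.getOrDefault(qName, ""));
        }

        @Override
        public void characters(char[] text, int start, int length) throws SAXException {
            String content = new String(text, start, length);
            if (!content.trim().isEmpty()) {
                emit(content);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            emit(endHtml.getOrDefault(qName, ""));
        }
    }

    // converts an XML string to HTML using the tables defined in TableHandler
    static String render(String xml) {
        try {
            StringWriter out = new StringWriter();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new TableHandler(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(
            "<todo><item><task>Learn to cook</task><note>soon</note></item></todo>"));
    }
}
```

The three callbacks contain no element names at all, which is what keeps the mapping easy to extend.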